简体   繁体   中英

Merge csv files in R

I have 3 .csv files that I need to analyse in R. File one contains columns with user id and signupdate. File two contains columns with user id, purchase date and amount of purchases. File three contains columns with user id, message date and number of messages.

Please note that the order of the user id is not the same in each of the three files, thus cop.

Would love some help merging these files so that the large dataset has order user id, signupdate, purchase date, amount of purchases, message date and number of messages. Can't seem to find code to do this in R Thanks in advance

While merge doesn't take three arguments, Reduce is made for the task of iterating over a list and passing pairs to a function. Here's an example of a three-way merge:

d1 <- data.frame(id=letters[1:3], x=2:4)
d2 <- data.frame(id=letters[3:1], y=5:7)
d3 <- data.frame(id=c('b', 'c', 'a'), z=c(5,6,8))
Reduce(merge, list(d1, d2, d3))
##   id x y z
## 1  a 2 7 8
## 2  b 3 6 5
## 3  c 4 5 6

Note that the order of the column id is not the same, but the values are match ed.

In the case where you have non-matching data and want to keep all possible rows, you need an outer join, by supplying all=TRUE to merge . As Reduce does not have a way to pass additional arguments to the function, a new function must be created to call merge :

d1 <- data.frame(id=letters[1:3], x=2:4)
d2 <- data.frame(id=letters[3:1], y=5:7)
d3 <- data.frame(id=c('b', 'c', 'd'), z=c(5,6,8))
Reduce(function(x,y) merge(x,y,all=TRUE), list(d1, d2, d3))
##   id  x  y  z
## 1  a  2  7 NA
## 2  b  3  6  5
## 3  c  4  5  6
## 4  d NA NA  8

NA is used to indicate data in non-matched rows.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM