
Combine two identical dataframe columns into comma separated columns in R

I have two identically structured data frames (same number of rows and columns, and the same headers). I would like to combine the two into one data frame whose columns hold comma-separated values.

I know how to do this with the dummy data frames below, but the same approach would be very cumbersome on my own data.

These are my dummy data frames. The headers of my "real" data are "1", "2", "3", etc., while those of the dummy data frames are "X1", "X2", "X3", etc.

> data1
  X1 X2 X3 X4
1  1  2  3  4
2  2  3  4  5
3  3  4  5  6
> data2
  X1 X2 X3 X4
1  8  9 13 14
2  9 10 14 15
3 10 11 15 16

What I would like:

> data3
   new1 new2 new3 new4
 1  1,8  2,9 3,13 4,14
 2  2,9 3,10 4,14 5,15
 3 3,10 4,11 5,15 6,16

This is how I managed to get that output, but I think it is too cumbersome for a large dataset:

data1<- data.frame('1'=1:3, '2'=2:4, '3'=3:5,'4'=4:6)
data2<- data.frame('1'=8:10, '2'=9:11, '3'=13:15,'4'=14:16)
names(data1) <- c("1a","2a","3a","4a")
names(data2) <- c("1b","2b","3b","4b")

data3<- cbind(data1,data2)

cols.1 <- c('1a','1b'); cols.2 <-c('2a','2b')
cols.3 <- c('3a','3b'); cols.4 <-c('4a','4b')

data3$new1 <- apply( data3[ , cols.1] , 1 , paste , collapse = "," )
data3$new2 <- apply( data3[ , cols.2] , 1 , paste , collapse = "," )
data3$new3 <- apply( data3[ , cols.3] , 1 , paste , collapse = "," )
data3$new4 <- apply( data3[ , cols.4] , 1 , paste , collapse = "," )

data3 <-data3[,c(9:12)]

Is there a way in which I can iterate this, perhaps with a for loop? Any help would be appreciated.

These posts are somewhat similar:

The same question, but for rows instead of columns: how to convert column values into comma seperated row vlaues

Similar, but it didn't work on my large dataset: Paste multiple columns together

Using only base R:

data1 <- data.frame(x1 = 1:3, x2 = 2:4, x3 = 3:5, x4 = 4:6)
data2 <- data.frame(x1 = 8:10, x2 = 9:11, x3 = 13:15, x4 = 14:16)

data3 <- mapply(function(x, y){paste(x,y, sep = ",")}, data1, data2)
data3 <- as.data.frame(data3)

    x1   x2   x3   x4
1  1,8  2,9 3,13 4,14
2  2,9 3,10 4,14 5,15
3 3,10 4,11 5,15 6,16
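Note that mapply() simplifies its result to a character matrix here, which is why the as.data.frame() step is needed. As a rough sketch of the same idea that also produces the new1, new2, ... headers asked for in the question, Map() can be used instead. The helper name combine_columns is hypothetical, not part of any package, and the dimension check is an added assumption about what "identically structured" should guarantee:

```r
# Dummy data from the question
data1 <- data.frame(X1 = 1:3, X2 = 2:4, X3 = 3:5, X4 = 4:6)
data2 <- data.frame(X1 = 8:10, X2 = 9:11, X3 = 13:15, X4 = 14:16)

# Hypothetical helper: paste matching columns of two equally shaped data frames
combine_columns <- function(a, b, sep = ",") {
  # Assumption: both inputs have the same number of rows and columns
  stopifnot(identical(dim(a), dim(b)))
  # Map() walks both data frames column by column and returns a named list
  out <- Map(function(x, y) paste(x, y, sep = sep), a, b)
  as.data.frame(out, stringsAsFactors = FALSE)
}

data3 <- combine_columns(data1, data2)
# Rename the columns to new1, new2, ... as in the desired output
names(data3) <- paste0("new", seq_along(data3))
data3
```

Because paste() is vectorized, each column is combined in a single call, so no row-wise apply() is needed.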

Here's a basic for loop approach:

newdf = data.frame(matrix(ncol=ncol(data1),nrow=nrow(data1)))

for (i in 1:ncol(data1)) {
  newdf[,i] = paste(data1[,i], data2[,i], sep=",")
}

#> newdf
#     X1   X2   X3   X4
# 1  1,8  2,9 3,13 4,14
# 2  2,9 3,10 4,14 5,15
# 3 3,10 4,11 5,15 6,16

Line-by-line explanation:

Initialize a new, empty data frame of the appropriate dimensions:

newdf = data.frame(matrix(ncol=ncol(data1),nrow=nrow(data1)))

Loop through columns 1, 2, ..., n and fill each column with the paste() results:

for (i in 1:ncol(data1)) {
  newdf[,i] = paste(data1[,i], data2[,i], sep=",")
}

Disclaimer: this may be slow on large datasets. A dplyr or data.table approach (or one of the vapply()/sapply()/apply() family of functions) will likely be faster, if you are interested in learning those methods.
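For comparison, here is a minimal loop-free sketch in base R. It assumes the "real" data really does use the headers "1", "2", ... mentioned in the question; real1, real2 and combined are made-up names for illustration, and check.names = FALSE is only there to stop data.frame() from turning those headers into "X1", "X2", ...:

```r
# Assumed stand-ins for the "real" data, keeping the headers "1" and "2"
real1 <- data.frame("1" = 1:3, "2" = 2:4, check.names = FALSE)
real2 <- data.frame("1" = 8:10, "2" = 9:11, check.names = FALSE)

# Copy the shape and headers of real1, then overwrite every column at once.
# Map() pairs up the columns of both data frames and pastes each pair.
combined <- real1
combined[] <- Map(paste, real1, real2, sep = ",")
combined
```

Assigning into combined[] keeps the original headers and the data frame class, so no preallocation with matrix() is required.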
