
Sum rows of each unique combination of variables in R

I want to create new variables that are the sum of each unique combination of 3 of the original variables.

Example of data:

df1 <- data.frame(A = c(1, 2, 3, 5.5, 5), B = c(2, 2, 2, 2, 0.5), C = c(1.5, 0, 0, 2.1, 3), D = c(0.2, 1, 2, 1, 0.8), E = c(0.4, 0.6, 0.2, 1.1, 2))

    A   B   C   D   E
1 1.0 2.0 1.5 0.2 0.4
2 2.0 2.0 0.0 1.0 0.6
3 3.0 2.0 0.0 2.0 0.2
4 5.5 2.0 2.1 1.0 1.1
5 5.0 0.5 3.0 0.8 2.0

I would like to create one new column for each unique combination of 3 variables. For example, 'sum1' would be the row-wise sum of columns A, B and C, 'sum2' the sum of A, B and D, 'sum3' the sum of A, B and E, and so on.

    A   B   C   D   E  sum1 sum2 sum3
1 1.0 2.0 1.5 0.2 0.4  4.5  3.2  3.4
2 2.0 2.0 0.0 1.0 0.6  4.0  5.0  4.6
3 3.0 2.0 0.0 2.0 0.2  5.0  7.0  5.2
4 5.5 2.0 2.1 1.0 1.1  9.6  8.5  8.6
5 5.0 0.5 3.0 0.8 2.0  8.5  6.3  7.5

From other questions I've figured out that this will select the unique combinations:

output <- combn(ncol(df1), 3, FUN = function(x) df1[x], simplify = FALSE)

This gives me a list of 10 (the number of all combinations), and I can view each group of variables selected using output[[1]], output[[2]] etc, but how do I then sum the rows of each and get them into a data frame?

Thank you

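We can call rowSums inside combn so that each combination of 3 columns is summed row by row, convert the resulting matrix to a data.frame, name its columns after the variables that were combined, and cbind it with the original dataset.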

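# one column of row sums per 3-column combination (5 rows x 10 columns)
output <- as.data.frame(combn(ncol(df1), 3, FUN = function(x) rowSums(df1[x])))
# name each column after the variables it sums, e.g. "sum_A_B_C"
names(output) <- paste0("sum_", combn(names(df1), 3, FUN = paste, collapse = "_"))
# append the new columns to the original data
cbind(df1, output)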
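
If you would rather keep the list of data frames from the question and get the numbered names 'sum1', 'sum2', ..., a minimal sketch along these lines should also work. The object names `sums` and `result` are only illustrative and do not appear in the original post.

```r
df1 <- data.frame(A = c(1, 2, 3, 5.5, 5), B = c(2, 2, 2, 2, 0.5),
                  C = c(1.5, 0, 0, 2.1, 3), D = c(0.2, 1, 2, 1, 0.8),
                  E = c(0.4, 0.6, 0.2, 1.1, 2))

# list of 10 data frames, one per 3-column combination (as in the question)
output <- combn(ncol(df1), 3, FUN = function(x) df1[x], simplify = FALSE)

# rowSums on each list element; sapply binds the results into a 5 x 10 matrix
sums <- as.data.frame(sapply(output, rowSums))

# numbered names in the order combn generated the combinations
names(sums) <- paste0("sum", seq_along(sums))

result <- cbind(df1, sums)
```

combn enumerates the combinations in lexicographic order of the column indices, so 'sum1' is A + B + C, 'sum2' is A + B + D and 'sum3' is A + B + E, matching the layout asked for in the question.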


 