
How to add two columns in a data frame to create a new third column, based on sub-strings of column names, in R?

Let's consider a simple data frame as follows:

id  area1feature1  area1feature2  area2feature1  area2feature2
1   1              2              3              4
2   3              6              1              5

Now I would like to combine feature1 across all areas, feature2 across all areas, and so on, and then create new columns sumOfFeature1, sumOfFeature2, etc.

So the expected output is something like this:

id  area1feature1  area1feature2  area2feature1  area2feature2  sumOfFeature1  sumOfFeature2
1   1              2              3              4              4              6
2   3              6              1              5              4              11

How can I match columns based on a substring of their names and then combine them to create new columns in the data frame?

The way I did it is as follows. Let input be the data frame.

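features_to_be_combined <- c('feature1', 'feature2')
# Column positions matching each feature; sapply returns a matrix here
# because every feature matches the same number of columns.
locations <- sapply(features_to_be_combined, grep, colnames(input))
feature1_locations <- locations[, 'feature1']
# Accumulate the row-wise sum over the matching columns.
sumOfFeature1 <- rep(0, nrow(input))
for (i in seq_along(feature1_locations)) {
    sumOfFeature1 <- sumOfFeature1 + input[, feature1_locations[i]]
}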

Now all that remains is to repeat the same procedure for feature2 and then add the newly created features, namely sumOfFeature1 and sumOfFeature2, to the input data frame. I am sure there is a better way to do this (maybe by using apply again over the combined features), but this worked for me as expected.
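One possible way to cover every feature in a single pass is to loop over the feature names and let rowSums do the addition. This is only a sketch, not a definitive solution. It assumes the example data from the question, that every relevant column name ends with the feature name (hence the `$` anchor in the pattern), and it uses `result` and `new_name` as illustrative names that are not part of the original code.

```r
# Example data from the question.
input <- data.frame(
  id = c(1, 2),
  area1feature1 = c(1, 3),
  area1feature2 = c(2, 6),
  area2feature1 = c(3, 1),
  area2feature2 = c(4, 5)
)

features_to_be_combined <- c("feature1", "feature2")

result <- input
for (f in features_to_be_combined) {
  # Assumption: column names end with the feature name, so anchor the
  # pattern with "$" to keep "feature1" from matching e.g. "feature10".
  cols <- grep(paste0(f, "$"), colnames(input))
  # "feature1" -> "sumOfFeature1"
  new_name <- paste0("sumOfF", substring(f, 2))
  # drop = FALSE keeps a data frame even if only one column matches.
  result[[new_name]] <- unname(rowSums(input[, cols, drop = FALSE]))
}

print(result)
```

Matching against colnames(input) rather than colnames(result) keeps the newly added sum columns out of later matches, and rowSums avoids the explicit accumulator loop over column positions.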
