Let's consider a simple data frame as follows:
id area1feature1 area1feature2 area2feature1 area2feature2
1 1 2 3 4
2 3 6 1 5
Now I would like to combine feature1 for all areas, feature2 for all areas, and so on, and then create new columns sumOfFeature1, sumOfFeature2, etc.
So the expected output is something like this:
id area1feature1 area1feature2 area2feature1 area2feature2 sumOfFeature1 sumOfFeature2
1 1 2 3 4 4 6
2 3 6 1 5 4 11
How can I match columns based on a substring of their names and then combine them to create new columns in the data frame?
The way I did it is as follows. Let input be the data frame.
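features_to_be_combined <- c('feature1', 'feature2')
# For each feature name, find the indices of the columns whose names contain it.
# sapply() returns a matrix here because every feature matches the same number of columns.
locations <- sapply(features_to_be_combined, grep, colnames(input))
feature1_locations <- locations[, 'feature1']
# Accumulate the row-wise sum over the matched columns.
sumOfFeature1 <- rep(0, nrow(input))
for (i in seq_along(feature1_locations)) {
  sumOfFeature1 <- sumOfFeature1 + input[, feature1_locations[i]]
}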
Now all that remains is to repeat the same procedure for feature2 and then add the newly created features, namely sumOfFeature1 and sumOfFeature2, to the input data frame. I am sure there is a better way to do this (maybe using apply again on the combined features), but this worked for me as expected.
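One possible shorter variant, offered as a sketch and not as the definitive answer, replaces the explicit accumulation loop with rowSums() over the columns matched by grep(). The data frame below is rebuilt from the example in the question. Anchoring the pattern with "$" and building the new column names with paste0() are assumptions added for illustration, not part of the original approach.

```r
# Example data frame from the question.
input <- data.frame(
  id = c(1, 2),
  area1feature1 = c(1, 3),
  area1feature2 = c(2, 6),
  area2feature1 = c(3, 1),
  area2feature2 = c(4, 5)
)

features_to_be_combined <- c("feature1", "feature2")

# Capture the column names once, so newly added sum columns are never matched.
original_cols <- colnames(input)

for (feature in features_to_be_combined) {
  # Assumption: anchor at the end of the name so "feature1" would not
  # also match a hypothetical "feature10" column.
  matching_cols <- grep(paste0(feature, "$"), original_cols, value = TRUE)

  # Build the new column name, e.g. "feature1" -> "sumOfFeature1".
  new_name <- paste0("sumOf", toupper(substring(feature, 1, 1)), substring(feature, 2))

  # drop = FALSE keeps a data frame even if only one column matches.
  input[[new_name]] <- rowSums(input[, matching_cols, drop = FALSE])
}

print(input)
```

Because the loop runs over features_to_be_combined, supporting another feature should only require adding its name to that vector, provided the column names follow the same areaNfeatureM pattern.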