简体   繁体   中英

Merge columns within a dataframe that is very wide in R

Suppose I have a data frame such like:

set.seed(123)
df<-data.frame(ID=letters[1:10],
             A1=rnorm(10),A2=rnorm(10,1,3),
             B1=rnorm(10),B2=rnorm(10,1,3),
             C1=rnorm(10),C2=rnorm(10,1,3),
             D1=rnorm(10),D2=rnorm(10,1,3),
             E1=rnorm(10),E2=rnorm(10,1,3))

What I wanted is to add up *1 and *2 columns from AE. It seems easy to realize but the key issue for my case is the df is very wide and I prefer to not type AE with a desired output with column names c("ID", "A", "B", "C", "D", "E") .

We can split the column names of 'df' except the first one ('ID') by the substring of the column names created by removing the numeric part (using sub ), loop through the list elements with sapply , subset the columns of 'df' based on the index, get the rowSums and cbind with the first column ( df[1] )

cbind(df[1], sapply(split(names(df)[-1], sub("\\d+", "", names(df)[-1])),
                              function(x) rowSums(df[x])))
# ID          A          B          C           D           E
#1   a  4.1117697  1.2115690  1.0652486 -0.09345402  3.98627575
#2   b  1.8492640 -0.1031894  0.7064425 -6.42983008  3.03047128
#3   c  3.7610227  2.6593725 -0.3940077  3.68400819  1.34553517
#4   d  1.4025565  2.9055092  7.2747628 -2.14617767 -0.23934168
#5   e -0.5382357  2.8397040  1.5306490 -2.13581708  4.86147078
#6   f  8.0758044  1.3792275  4.4263032  4.38024275 -0.46899680
#7   g  2.9544676  3.4995400 -4.0491432  0.59389076  8.65883799
#8   h -6.1649127  0.9676380  2.2871859 -2.60914891  6.03301337
#9   i  2.4172149 -1.0560249  2.1515278  2.46617791 -0.03303266
#10  j -0.8640362  1.1124019  1.5644556  2.63341060 -0.93045508

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM