
Sum of columns with partial string match, with names imported from a list (R)

I'm working with data tables like so:

library(data.table)

ID <- c("ChrM", "ChrM", "ChrM", "ChrM", "ChrM")
pos <- c(5, 6, 7, 10, 11)
cr.H.MN.8A <- c(0, 0, 3, 0, 1)
cr.H.MN.8B <- c(2, 1, 0, 1, 0)
cr.H.MR.8A <- c(0, 0, 0, 0, 1)
cr.H.MR.8B <- c(0, 2, 0, 0, 1)
cr.H.MR2.8A <- c(0, 1, 1, 0, 0)
cr.H.MR2.8B <- c(0, 1, 0, 0, 0)
dfa <- data.table(ID, pos, cr.H.MN.8A, cr.H.MN.8B, cr.H.MR.8A, cr.H.MR.8B, cr.H.MR2.8A, cr.H.MR2.8B)



dfa
         ID pos cr.H.MN.8A cr.H.MN.8B cr.H.MR.8A cr.H.MR.8B cr.H.MR2.8A cr.H.MR2.8B
    1: ChrM   5          0          2          0          0           0           0
    2: ChrM   6          0          1          0          2           1           1
    3: ChrM   7          3          0          0          0           1           0
    4: ChrM  10          0          1          0          0           0           0
    5: ChrM  11          1          0          1          1           0           0

Now, in an earlier question I asked how I could use the similarities in the column names to add the matching columns up:

dfa[,`H.MN.8` := as.numeric(rowSums(.SD) > 1), .SDcols = grep("cr.H.MN.8", names(dfa))]

This works flawlessly. But I have a follow-up question: is it possible to automatically supply the pattern used to pick out the columns? I have the names of each column in a list (they used to be separate data.frames), so could I use that list in place of the hard-coded H.MN.8 and make this easier to compute? Thanks!

The names are so:

names(dfb)<-c("cr.H.MN.8A","cr.H.MN.8B","cr.H.MR.8A","cr.H.MR.8B","cr.H.MR2.8A","cr.H.MR2.8B")

You can try building the call as a string for each name and evaluating it:

varName <- c("H.MR2","H.MN.8")
eval(parse(text=paste0("dfa[,`",varName,"` := as.numeric(rowSums(.SD) > 1), .SDcols = grep('",varName,"', names(dfa))]")))
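
As an alternative, here is a minimal sketch that avoids eval(parse()) and derives the group names from the column names themselves. It assumes every sample column follows the pattern "cr.<group><replicate letter>"; the variables sample_cols and groups and that naming rule are my own illustration, not something given in the question. It also uses startsWith() rather than grep(), because grep() treats "." as a wildcard, and a pattern such as 'H.MN.8' would also match the newly created H.MN.8 column if the code were run a second time.

```r
library(data.table)

# Same data as in the question
dfa <- data.table(
  ID          = rep("ChrM", 5),
  pos         = c(5, 6, 7, 10, 11),
  cr.H.MN.8A  = c(0, 0, 3, 0, 1),
  cr.H.MN.8B  = c(2, 1, 0, 1, 0),
  cr.H.MR.8A  = c(0, 0, 0, 0, 1),
  cr.H.MR.8B  = c(0, 2, 0, 0, 1),
  cr.H.MR2.8A = c(0, 1, 1, 0, 0),
  cr.H.MR2.8B = c(0, 1, 0, 0, 0)
)

# Column names as they come from the list (hypothetical variable name)
sample_cols <- c("cr.H.MN.8A", "cr.H.MN.8B", "cr.H.MR.8A",
                 "cr.H.MR.8B", "cr.H.MR2.8A", "cr.H.MR2.8B")

# Assumed naming rule: strip the "cr." prefix and the trailing replicate letter
groups <- unique(sub("^cr\\.(.*)[A-Z]$", "\\1", sample_cols))

for (g in groups) {
  # Literal prefix match, so "cr.H.MR.8" does not pick up "cr.H.MR2.8A"
  cols <- sample_cols[startsWith(sample_cols, paste0("cr.", g))]
  # (g) on the left-hand side creates a column named by the value of g
  dfa[, (g) := as.numeric(rowSums(.SD) > 1), .SDcols = cols]
}

dfa
```

Because the columns are selected from sample_cols rather than from names(dfa), the result columns added by the loop can never feed back into a later sum.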
