
Generate new calculated variable with automated search of list

This is a simple question I think, but I have had trouble getting it to work.

I have a data frame with an ID variable (unique per row) and a series of columns with binary (0/1) results.

# ID Var1 Var2 Var3 Var4 Var5
# 1  0    0    0    1    0
# 2  1    0    0    0    0
# 3  0    1    0    0    0

I have a list of variable "classes":

ClassList = list(Class1 = c("Var1", "Var2"), Class2 = c("Var3", "Var4", "Var5"))

I would like to generate a new variable Class1 that equals 1 if Var1 or Var2 equals 1, and 0 otherwise (and likewise Class2 from Var3, Var4 and Var5).

I can do this using less elegant means, but would like a more automated function/loop/apply that creates each class variable, looks up its members in ClassList, and recodes appropriately to generate the following:

# ID Var1 Var2 Var3 Var4 Var5 Class1 Class2
# 1  0    0    0    1    0    0      1
# 2  1    0    0    0    0    1      0
# 3  0    1    0    0    0    1      0

There are a lot of Vars and Classes to consolidate, so brute-forcing it with if_else will not be efficient. Any suggestions?

The part I have done so far is to generate the class variables:

# add one placeholder column per class, named after the list element
for (classname in names(ClassList)) {
  df[[classname]] <- NA
}

Here is a base R method with lapply, max.col, and matrix subsetting:

df[names(ClassList)] <- lapply(ClassList,
                               function(i) df[i][cbind(seq_len(nrow(df)), max.col(df[i]))])

which returns

df
  ID Var1 Var2 Var3 Var4 Var5 Class1 Class2
1  1    0    0    0    1    0      0      1
2  2    1    0    0    0    0      1      0
3  3    0    1    0    0    0      1      0
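To reproduce the result above, the data first has to exist. The following is a minimal sketch that rebuilds the three-row example from the question (the data.frame construction is my assumption about how df was created; only the values come from the post) and then runs the same lapply call:

```r
# Rebuild the example data from the question (construction is assumed).
df <- data.frame(
  ID   = 1:3,
  Var1 = c(0, 1, 0),
  Var2 = c(0, 0, 1),
  Var3 = c(0, 0, 0),
  Var4 = c(1, 0, 0),
  Var5 = c(0, 0, 0)
)

ClassList <- list(Class1 = c("Var1", "Var2"),
                  Class2 = c("Var3", "Var4", "Var5"))

# For each class, pull the row-wise maximum of its member columns.
# max.col() breaks ties at random by default, but tied cells hold the
# same value, so the extracted result does not depend on the tie-break.
df[names(ClassList)] <- lapply(ClassList, function(i) {
  df[i][cbind(seq_len(nrow(df)), max.col(df[i]))]
})

print(df)
```

No placeholder columns need to be created beforehand: assigning to df[names(ClassList)] adds the new columns directly.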

Here, lapply applies a function to each element of ClassList. Inside it, df[i] subsets the data.frame to the columns named in that element, and max.col returns, for each row, the position of the column holding the row maximum. I cbind these column positions to the row positions to get a two-column matrix that indexes the maximum element of each row. These elements are then extracted from the data.frame and returned as a vector. Because the columns are binary, the row maximum is 1 if any variable in the class is 1, and 0 otherwise.
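The intermediate objects may be easier to follow for a single class. The sketch below walks through Class2 on the example data; the names sub, pos, idx, class2 and class2_alt are illustrative, and ties.method = "first" is used only to make the printed positions reproducible (the answer relies on the default). The last line is a cross-check with rowSums, which is a different technique from the one in the answer and assumes the columns contain only 0 and 1 with no missing values.

```r
# Example data from the question (construction is assumed).
df <- data.frame(
  ID   = 1:3,
  Var1 = c(0, 1, 0),
  Var2 = c(0, 0, 1),
  Var3 = c(0, 0, 0),
  Var4 = c(1, 0, 0),
  Var5 = c(0, 0, 0)
)

vars <- c("Var3", "Var4", "Var5")   # members of Class2
sub  <- df[vars]                    # data.frame restricted to those columns

# Column position of the maximum in each row; ties go to the first column.
pos <- max.col(sub, ties.method = "first")

# Two-column (row, column) index matrix, one row per observation.
idx <- cbind(seq_len(nrow(sub)), pos)

# Matrix indexing extracts exactly one cell per row: the row maximum.
class2 <- as.matrix(sub)[idx]

# Alternative (not from the answer): any 1 in the row gives a positive sum.
class2_alt <- as.integer(rowSums(sub) > 0)
```

Applied over the whole list, the alternative would read df[names(ClassList)] <- lapply(ClassList, function(i) as.integer(rowSums(df[i]) > 0)), which some readers may find easier to scan than the index-matrix form.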


 