This is a simple question I think, but I have had trouble getting it to work.
I have a data frame with an ID variable (unique per row) and a series of columns with binary (0/1) results.
# ID Var1 Var2 Var3 Var4 Var5
# 1 0 0 0 1 0
# 2 1 0 0 0 0
# 3 0 1 0 0 0
I have a list of variable "classes":
ClassList = list(Class1 = c("Var1", "Var2"), Class2 = c("Var3", "Var4", "Var5"))
I would like to generate a new variable Class1 that is 1 if Var1 or Var2 equals 1, and 0 otherwise.
I can do this using less elegant means, but would like a more automated function/loop/apply that creates each class variable, looks up ClassList, and recodes appropriately to generate the following:
# ID Var1 Var2 Var3 Var4 Var5 Class1 Class2
# 1 0 0 0 1 0 0 1
# 2 1 0 0 0 0 1 0
# 3 0 1 0 0 0 1 0
There are a lot of Vars and Classes to consolidate, so brute-forcing it with if_else would not be efficient. Any suggestions?
What I have done so far is generate the class variables:
for (i in seq_along(ClassList)) {
  # create an empty (NA) column named after each class
  classname <- names(ClassList)[i]
  df[, classname] <- NA
}
Here is a base R method with lapply, max.col, and matrix subsetting:
df[names(ClassList)] <- lapply(ClassList,
function(i) df[i][cbind(seq_len(nrow(df)), max.col(df[i]))])
which returns
df
ID Var1 Var2 Var3 Var4 Var5 Class1 Class2
1 1 0 0 0 1 0 0 1
2 2 1 0 0 0 0 1 0
3 3 0 1 0 0 0 1 0
Here, lapply loops over the elements of ClassList, and df[i] subsets the data.frame to the columns named in each element. max.col is applied to that subset and returns, for each row, the position of the column holding the maximum value. I cbind these column positions to the row positions to build a two-column matrix of (row, column) indices pointing at the maximum element of each row. Those elements are then extracted from the data.frame and returned as a vector, one value per row.
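To make the steps above concrete, here is a minimal sketch that walks through a single class (Class2) by hand, using the example data from the question. The intermediate names (cols, sub, pos, idx, class2, class2_alt) are only for illustration, and ties.method = "first" is set here solely so the intermediate positions are reproducible; the extracted values do not depend on it.

```r
# Rebuild the example data from the question
df <- data.frame(ID   = 1:3,
                 Var1 = c(0, 1, 0),
                 Var2 = c(0, 0, 1),
                 Var3 = c(0, 0, 0),
                 Var4 = c(1, 0, 0),
                 Var5 = c(0, 0, 0))
ClassList <- list(Class1 = c("Var1", "Var2"),
                  Class2 = c("Var3", "Var4", "Var5"))

# Walk through one class step by step
cols <- ClassList[["Class2"]]
sub  <- df[cols]                              # only Var3, Var4, Var5
pos  <- max.col(sub, ties.method = "first")   # column position of each row's maximum: 2, 1, 1
idx  <- cbind(seq_len(nrow(sub)), pos)        # (row, column) index pairs
class2 <- sub[idx]                            # value at each pair, i.e. the row maximum: 1, 0, 0

# Alternative sketch (not the method in the answer): for 0/1 columns,
# "at least one 1 in the row" gives the same indicator
class2_alt <- as.integer(rowSums(sub) > 0)
```

Because every column is 0/1, the row maximum is 1 exactly when at least one variable in the class is 1, which is why extracting the maximum produces the desired indicator. The rowSums variant is a swapped-in technique that may be easier to read, under the assumption that the columns contain only 0 and 1 with no missing values.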