Comparing multiple columns including NAs in a data frame in R
I have a data frame containing 1s, 2s and a bunch of NAs.
I would like to compare these columns row by row and save the result in a new column (say F) so that: if all values in a row are 1, the new column gets 1 for that row; if all values are 2, it gets 2; and if the values differ (a mix of 1 and 2), it gets a new number such as 3.
Do you have any idea how to do this?
Try this. If the variance of a row is 0, the new column equals the row mean. Otherwise, the new column equals the sum of the unique values (for example, 2, 2, 4 gives 2 + 4 = 6). If a row has only one non-NA value, var() returns NA and the comparison fails, so the first "if" statement handles that case.
# Example data, filled row by row into five columns
df <- as.data.frame(matrix(c(1, 1, 1, NA, NA, NA, 2, 2, NA, NA, NA, NA, 3, 2, 1, 2,
                             NA, NA, 4, 2, NA, 2, NA, NA, 5, 1, NA, NA, NA, NA),
                           ncol = 5, byrow = TRUE))
colnames(df) <- c("A", "B", "C", "D", "F")
for (i in seq_len(nrow(df))) {
  # Non-NA values of row i across the five input columns
  vals <- as.numeric(df[i, 1:5])
  vals <- vals[!is.na(vals)]
  if (length(vals) == 1) {
    # A single value: var() would return NA, so copy the value directly
    df[i, "col3"] <- vals
  } else if (var(vals) == 0) {
    # All values identical: the mean equals that value
    df[i, "col3"] <- mean(vals)
  } else {
    # Values differ: sum of the unique values (1 and 2 give 3)
    df[i, "col3"] <- sum(unique(vals))
  }
}
df
*Updated to work for more than two columns.*
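For what it's worth, the rule from the question (all 1s give 1, all 2s give 2, a mix gives 3) could also be sketched without an explicit loop by looking at the distinct non-NA values in each row. This is only a rough alternative, not the method above: the helper name classify_row, the toy data frame and the result column are all made up for illustration, and returning NA for a row that is entirely NA is an assumption, since the question does not say what should happen there.

```r
# Hypothetical helper: classify one row made of 1s, 2s and NAs.
classify_row <- function(x) {
  vals <- unique(x[!is.na(x)])              # distinct non-NA values in the row
  if (length(vals) == 0) return(NA_real_)   # assumption: all-NA row stays NA
  if (length(vals) == 1) return(vals)       # all 1s -> 1, all 2s -> 2
  3                                         # mixed values -> 3
}

# Made-up example data, not the asker's data frame
toy <- data.frame(A = c(1, 2, 2, NA),
                  B = c(1, NA, 1, NA),
                  C = c(NA, 2, 2, NA))

# Apply the helper to every row (MARGIN = 1) of the input columns
toy$result <- apply(toy[, c("A", "B", "C")], 1, classify_row)
toy
```

Unlike the sum of unique values, this always assigns 3 to a mixed row, which only matters if the columns can hold values other than 1 and 2.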