How to create a variable based on the values from more than one column in R
I have a data frame with three variables; the valid values for each variable are 1, 2, 3, 4, 5, 6, 7. If no numeric value is assigned to a variable, it shows NA. The data frame a looks like this:
  ak_eth co_eth pa_eth
1     NA      1     NA
2     NA     NA      1
3     NA     NA     NA
4      2     NA     NA
5     NA     NA      4
6     NA     NA     NA
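For anyone who wants to reproduce the problem, here is a minimal sketch of how the sample data could be built. The values are copied from the printout above; storing the columns as plain numeric vectors is an assumption, since the original column types are not shown.

```r
# Rebuild the sample data frame `a` from the printed rows.
# Assumption: the columns are ordinary numeric vectors.
a <- data.frame(
  ak_eth = c(NA, NA, NA, 2, NA, NA),
  co_eth = c(1, NA, NA, NA, NA, NA),
  pa_eth = c(NA, 1, NA, NA, 4, NA)
)

print(a)
```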
Each row either has NA across all three variables or has exactly one value in one of the three variables. I want to create a new variable called recode that takes its value from the three existing variables. If all three existing variables are NA, the new value is NA; if one of the three existing variables has a value, the new variable takes that value. I tried the following, but it did not work:
a$recode[is.na(a$ak_eth) & is.na(a$co_eth) & is.na(a$pa_eth)] <- "NA"
library(car)
a$recode <- recode(a$ak_eth, "1=1;2=2;3=3;4=4;5=5;6=6;7=7")
a$recode <- recode(a$co_eth, "1=1;2=2;3=3;4=4;5=5;6=6;7=7")
a$recode <- recode(a$pa_eth, "1=1;2=2;3=3;4=4;5=5;6=6;7=7")
Any suggestions would be appreciated. Thanks!
We can use pmax:
a$Recode_Var <- do.call(pmax, c(a, na.rm = TRUE))
Or use pmin:
a$Recode_Var <- do.call(pmin, c(a, na.rm = TRUE))
Another option is rowSums:
r1 <- rowSums(a, na.rm = TRUE)
a$Recode_Var <- replace(r1, r1==0, NA)
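As a quick check, the three approaches above can be run side by side on the sample data from the question. This is only a sketch: the data frame is retyped from the printout, and the by_pmax, by_pmin, and by_sum names are made up for the comparison.

```r
# Sample data retyped from the question (assumed numeric columns).
a <- data.frame(
  ak_eth = c(NA, NA, NA, 2, NA, NA),
  co_eth = c(1, NA, NA, NA, NA, NA),
  pa_eth = c(NA, 1, NA, NA, 4, NA)
)

# Compute each result into its own vector first, so that no new column
# is accidentally fed into the next calculation.
by_pmax <- do.call(pmax, c(a, na.rm = TRUE))
by_pmin <- do.call(pmin, c(a, na.rm = TRUE))

r1 <- rowSums(a, na.rm = TRUE)
by_sum <- replace(r1, r1 == 0, NA)

# Each vector holds 1, 1, NA, 2, 4, NA for this data.
comparison <- data.frame(by_pmax, by_pmin, by_sum = as.numeric(by_sum))
print(comparison)

a$Recode_Var <- by_pmax
```

Note that once Recode_Var has been added to a, calling do.call(pmax, c(a, na.rm = TRUE)) again would include the new column as well, so it is safer to select the three source columns explicitly if the code is rerun.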
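NOTE: These approaches rely on the OP's statement that "Each row could have NA across all three variables or have only one value in one of the three variables". With at most one non-NA value per row, the maximum, the minimum, and the sum of the non-NA values are all equal to that single value, and a row sum of 0 can only mean that the row is all NA, since the valid values are 1 to 7.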
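If it is unclear whether the data really satisfy that condition, it can be checked before recoding. The sketch below also shows a fallback that keeps the first non-NA value per row, scanning the columns left to right. This is a different technique from the ones above, not part of the original answer, and the helper name first_non_na is made up for illustration.

```r
# Sample data retyped from the question (assumed numeric columns).
a <- data.frame(
  ak_eth = c(NA, NA, NA, 2, NA, NA),
  co_eth = c(1, NA, NA, NA, NA, NA),
  pa_eth = c(NA, 1, NA, NA, 4, NA)
)

# Count the non-missing values in each row; the assumption holds
# when no row has more than one.
n_values <- rowSums(!is.na(a))
assumption_holds <- all(n_values <= 1)
print(assumption_holds)

# Hypothetical helper: take x where it is present, otherwise fall back to y.
first_non_na <- function(x, y) ifelse(is.na(x), y, x)

# Fold the helper over the columns, left to right.
recode_first <- Reduce(first_non_na, a)
print(recode_first)
```

When the condition holds, this gives the same result as pmax, pmin, and rowSums; when it does not, the three original approaches would silently disagree with each other, while this one always picks the leftmost available value.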