Replacing NAs with a new factor level in one column based on factor level in another column using data.table
library(data.table)
DATA = data.table(col_1 = factor(c("A", "B", "C", "C", "B", "A", "C")),
                  col_2 = factor(c("stuff", NA, NA, "stuff", NA, "different_stuff", NA)))
I have a big data set in which I would like to replace the NAs in col_2 that correspond to level C in col_1 with a new factor level, e.g. yet_another_stuff. There are more NAs than there are observations with level C, and I don't want to replace the NAs that belong to other levels such as B.
After loading this data set, the columns are already of class factor.
I would strongly prefer to do this with the data.table package because of the size of the data set.
We can specify the logical condition in i and assign 'yet_another_stuff' to the values of 'col_2' that meet that condition:
DATA[is.na(col_2) & col_1 == "C", col_2 := "yet_another_stuff"]
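To check that only the intended rows change, here is a minimal self-contained sketch that rebuilds the example data from the question, applies the same assignment, and inspects the result. It assumes a data.table version in which := adds a previously unseen level when a character value is assigned to a factor column; if your version behaves differently, the checks at the end will fail rather than pass silently.

```r
library(data.table)

# Example data from the question; both columns are factors.
DATA <- data.table(
  col_1 = factor(c("A", "B", "C", "C", "B", "A", "C")),
  col_2 = factor(c("stuff", NA, NA, "stuff", NA, "different_stuff", NA))
)

# Update by reference: only rows where col_2 is NA and col_1 is "C".
DATA[is.na(col_2) & col_1 == "C", col_2 := "yet_another_stuff"]

# Rows 3 and 7 (col_1 == "C", col_2 previously NA) now hold the new level;
# the NAs in rows 2 and 5 (col_1 == "B") are left untouched.
print(DATA)

# Sanity checks: col_2 is still a factor and the new level exists.
stopifnot(is.factor(DATA$col_2))
stopifnot("yet_another_stuff" %in% levels(DATA$col_2))
stopifnot(all(is.na(DATA[col_1 == "B", col_2])))
```

Because := modifies the table by reference, no copy of the whole data set is made, which matters for a large table like the one described in the question.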