Replace NAs in one column with a new factor level, based on the factor level in another column, using data.table

Question

library(data.table)

DATA = data.table(col_1 = factor(c("A", "B", "C", "C", "B", "A", "C")),
                  col_2 = factor(c("stuff", NA, NA, "stuff", NA, "different_stuff", NA)))

I have a big data set in which I would like to replace the NAs in col_2 that correspond to level C in col_1 with a new factor level, e.g. yet_another_stuff. There are more NAs than there are observations with level C, and I don't want to replace the NAs that belong to other levels such as B.

After loading this data set, the columns are already of class factor.

I would strongly prefer to do this with the data.table package because of the size of the data set.

Answer 1

We can specify a logical condition in i and assign 'yet_another_stuff' to the values of 'col_2' that match that condition.

DATA[is.na(col_2) & col_1 == "C", col_2 := "yet_another_stuff"]
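
To check that the assignment above touches only the intended rows, here is a minimal self-contained sketch that rebuilds the question's example data, applies the same one-liner, and inspects the result. It assumes a data.table version that appends an unseen level to a factor column when a character value is assigned by reference; if your version does not, add the level to col_2 before assigning.

```r
library(data.table)

# Example data from the question
DATA <- data.table(
  col_1 = factor(c("A", "B", "C", "C", "B", "A", "C")),
  col_2 = factor(c("stuff", NA, NA, "stuff", NA, "different_stuff", NA))
)

# Update by reference: only rows where col_2 is NA and col_1 is "C"
DATA[is.na(col_2) & col_1 == "C", col_2 := "yet_another_stuff"]

# Inspect the result: the new level should be present,
# and the NAs that belong to level "B" should be left as they were
print(DATA)
print(levels(DATA$col_2))
print(DATA[is.na(col_2), .N, by = col_1])
```

Because := modifies the table in place, no copy of the data set is made, which is the main reason to prefer this form on large data.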
