Replace NAs in a data.table column with the mean of the same column, grouped by a factor

Question

I have the following sample data table:

library(data.table)
steps.dt <- data.table(steps = rep(0:2, each = 3),
                       date = as.factor(rep(c("10/2/2012", "10/3/2012", "10/4/2012"), each = 3)),
                       interval = as.factor(rep(c(0, 5, 10), each = 3)))

Inserting a few NAs:

steps.dt[c(2,5,8),"steps"]=NA

The table now looks like this:

   steps      date interval
1:     0 10/2/2012        0
2:    NA 10/2/2012        0
3:     0 10/2/2012        0
4:     1 10/3/2012        5
5:    NA 10/3/2012        5
6:     1 10/3/2012        5
7:     2 10/4/2012       10
8:    NA 10/4/2012       10
9:     2 10/4/2012       10

Now I am trying to replace the NAs in the column "steps" with the mean of steps, grouped by the factor "interval".

I have looked at some of the posts on SO like this, but the fact that the replacement has to be grouped by a factor complicates it. Is there a way to do this without using a loop? Thank you!

Answer 1

We can use na.aggregate from zoo to replace the NAs with the mean of 'steps' after grouping by 'interval'. Its default aggregation function is mean.

library(zoo)
# fill NAs in place with the per-interval mean
steps.dt[, steps := na.aggregate(steps), by = interval]
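
If you would rather not depend on zoo, the same grouped fill can probably be written with data.table and base R alone. The following is only a sketch and not part of the original answer: it rebuilds a reduced copy of the sample table (without the date column) and stores steps as numeric, which is my own assumption so that a non-integer group mean would fit in the column without coercion.

```r
library(data.table)

# Reduced copy of the sample data; steps is stored as numeric (an assumption,
# the question uses integer) so that fractional means are not truncated.
steps.dt <- data.table(
  steps    = as.numeric(rep(0:2, each = 3)),
  interval = as.factor(rep(c(0, 5, 10), each = 3))
)
steps.dt[c(2, 5, 8), steps := NA]

# Within each interval, replace only the NA entries with the mean of the
# non-missing values of that interval. := updates the table by reference.
steps.dt[, steps := replace(steps, is.na(steps), mean(steps, na.rm = TRUE)),
         by = interval]

print(steps.dt)
```

Because := modifies steps.dt in place, there is no need to assign the result back. If a whole group is NA, mean(..., na.rm = TRUE) returns NaN, so that case would need separate handling.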

Answer 2

Solution using dplyr:

library(dplyr)
# within each interval, replace NA with the mean of the non-missing steps
steps.dt <- steps.dt %>% group_by(interval) %>%
                         mutate(steps = ifelse(is.na(steps), mean(steps, na.rm = TRUE), steps))
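
One thing to keep in mind with the dplyr route is that group_by() returns a grouped tibble, so the result is no longer a plain data.table. A possible variant, sketched here under my own assumptions (a reduced copy of the sample data with steps stored as numeric, and a hypothetical result name filled), drops the grouping and converts back at the end:

```r
library(data.table)
library(dplyr)

# Reduced copy of the sample data; steps as numeric is an assumption so that
# if_else(), which is strict about types, sees double on both branches.
steps.dt <- data.table(
  steps    = as.numeric(rep(0:2, each = 3)),
  interval = as.factor(rep(c(0, 5, 10), each = 3))
)
steps.dt[c(2, 5, 8), steps := NA]

# `filled` is a hypothetical name for the result.
filled <- steps.dt %>%
  group_by(interval) %>%
  mutate(steps = if_else(is.na(steps), mean(steps, na.rm = TRUE), steps)) %>%
  ungroup() %>%        # remove the grouping so later verbs act on all rows
  as.data.table()      # convert the tibble back to a data.table

print(filled)
```

Unlike the := approach, this creates a new object and leaves steps.dt unchanged.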

Answer 1
0, accepted, 2017-06-05 12:33:24

Answer 2
0, 2017-06-05 13:05:22
