Replace NA in all columns in a dplyr chain

Question

The question "replace NA in a dplyr chain" led to the solution

dt %>% group_by(a) %>% mutate(b = ifelse(is.na(b), mean(b, na.rm = TRUE), b))

with dplyr. I want to impute all columns with a dplyr chain. There is no single column to group by; rather, I want every numeric column to have its NAs replaced by a summary value such as the column mean.

What is the most elegant way to replace all NAs with column means using tidyverse/dplyr?

Answer 1

We can use mutate_all with ifelse

dt %>%
   group_by(a) %>%
   mutate_all(~ ifelse(is.na(.), mean(., na.rm = TRUE), .))
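
In dplyr 1.0.0 and later, across() supersedes the scoped verbs such as mutate_all. The following is a minimal sketch of the same grouped imputation written that way; the data frame df and its values are made up for illustration and are not from the original post.

```r
library(dplyr)

# Hypothetical data: one grouping column and two numeric columns with NAs
df <- data.frame(
  a = rep(c("x", "y"), each = 3),
  b = c(1, NA, 3, 4, NA, 8),
  c = c(NA, 2, 4, 10, 20, NA)
)

imputed <- df %>%
  group_by(a) %>%
  # across() does not select the grouping column, so only b and c are touched;
  # the mean is computed within each group
  mutate(across(everything(),
                ~ ifelse(is.na(.x), mean(.x, na.rm = TRUE), .x))) %>%
  ungroup()
```

Each NA is filled with the mean of its own group, so the result depends on the grouping column chosen.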

If we want a compact option, use na.aggregate from zoo, which by default replaces NA values with the mean

dt %>% 
   group_by(a) %>% 
   mutate_all(zoo::na.aggregate)
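
To see what na.aggregate does on its own, here is a small sketch on a bare vector. It assumes the zoo package is installed; the vector x is a made-up example, and FUN is used to swap the default mean for another summary.

```r
# Hypothetical example vector, not from the original post
x <- c(1, NA, 3, 10)

# Default: each NA is replaced by the mean of the non-missing values
filled_mean <- zoo::na.aggregate(x)

# FUN swaps in another summary, here the median
filled_median <- zoo::na.aggregate(x, FUN = median)
```

Inside a grouped mutate, the function is called once per group and column, which is why the fill value differs between groups.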

If we don't have a grouping variable, remove the group_by and use mutate_if (to guard against any non-numeric columns)

dt %>%
   mutate_if(is.numeric, zoo::na.aggregate)
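
If pulling in zoo is not desirable, the same ungrouped imputation can be sketched with dplyr and base R only. This assumes dplyr 1.0.0 or later for across() and where(); the data frame df below is hypothetical.

```r
library(dplyr)

# Hypothetical data with one non-numeric column that must stay untouched
df <- data.frame(
  id = c("p", "q", "r", "s"),
  b  = c(2, NA, 4, 6),
  c  = c(NA, 1, NA, 3),
  stringsAsFactors = FALSE
)

imputed <- df %>%
  # where(is.numeric) restricts the imputation to numeric columns;
  # replace() fills the NA positions with the whole-column mean
  mutate(across(where(is.numeric),
                ~ replace(.x, is.na(.x), mean(.x, na.rm = TRUE))))
```

The id column passes through unchanged because it is not selected by where(is.numeric).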

If all the columns are numeric, this is even simpler:

zoo::na.aggregate(dt)

Data

set.seed(42)
dt <- data.frame(a = rep(letters[1:3], each = 3),
                 b = sample(c(NA, 1:5), 9, replace = TRUE),
                 c = sample(c(NA, 1:3), 9, replace = TRUE))

Answer 1: 9 votes, accepted, 2018-01-02 10:19:05
