Replace NA in all columns of a dplyr chain
The question "replace NA in a dplyr chain" leads to the solution
dt %>% group_by(a) %>% mutate(b = ifelse(is.na(b), mean(b, na.rm = TRUE), b))
with dplyr. I want to impute all columns in a dplyr chain. There is no single column to group by; rather, I want every numeric column to have all of its NAs replaced by a summary value such as the column mean.
What is the most elegant way to replace all NAs with column means with tidyverse/dp?
We can use mutate_all with ifelse:
dt %>%
  group_by(a) %>%
  mutate_all(~ ifelse(is.na(.), mean(., na.rm = TRUE), .))
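In dplyr 1.0.0 and later, the scoped verbs such as mutate_all are superseded by across(). The sketch below shows one way the grouped imputation could be written in that style. The data frame `toy` and its values are made up for illustration and are not from the question.

```r
library(dplyr)

# Hypothetical toy data: two groups, with NAs in both numeric columns
toy <- data.frame(a = c("x", "x", "x", "y", "y"),
                  b = c(1, NA, 3, 10, NA),
                  c = c(NA, 2, 4, NA, 6))

imputed <- toy %>%
  group_by(a) %>%
  # replace each NA with the mean of its column within the current group
  mutate(across(where(is.numeric),
                ~ ifelse(is.na(.x), mean(.x, na.rm = TRUE), .x))) %>%
  ungroup()

imputed$b  # 1 2 3 10 10
imputed$c  # 3 2 4 6 6
```

The grouping column is left alone because across() never selects grouping variables inside a grouped mutate().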
If we want a compact option, then use na.aggregate from zoo, which by default replaces NA values with the mean:
dt %>%
group_by(a) %>%
mutate_all(zoo::na.aggregate)
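Since the mean is only the default, na.aggregate can be given another summary function through its FUN argument. A small sketch on a hypothetical vector `x` (values chosen for illustration only):

```r
library(zoo)

# Hypothetical vector with one missing value and one outlier
x <- c(1, NA, 3, 100)

by_mean   <- na.aggregate(x)                # NA filled with the mean of 1, 3, 100
by_median <- na.aggregate(x, FUN = median)  # NA filled with the median of 1, 3, 100

by_median  # 1 3 3 100
```

The median may be the safer choice when a column contains outliers, as the mean-based fill here lands far from the typical values.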
If we don't have a grouping variable, then remove the group_by and use mutate_if (just to be cautious in case there are non-numeric columns):
dt %>%
mutate_if(is.numeric, zoo::na.aggregate)
If all the columns are numeric, it is even simpler:
zoo::na.aggregate(dt)
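For comparison, column-mean imputation on an all-numeric data frame can also be sketched in base R without any packages. The data frame `num` is hypothetical, and this is an illustration of the idea rather than a description of how zoo implements it.

```r
# Hypothetical all-numeric data frame
num <- data.frame(b = c(1, NA, 3),
                  c = c(NA, 4, 8))

# Replace the NAs in each column with that column's mean;
# assigning into num[] keeps the data.frame structure
num[] <- lapply(num, function(col) {
  replace(col, is.na(col), mean(col, na.rm = TRUE))
})

num$b  # 1 2 3
num$c  # 6 4 8
```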
Data used in the examples:
set.seed(42)
dt <- data.frame(a = rep(letters[1:3], each = 3),
                 b = sample(c(NA, 1:5), 9, replace = TRUE),
                 c = sample(c(NA, 1:3), 9, replace = TRUE))