
Saving na.rm=TRUE for each function in dplyr

I am using dplyr's summarise function. My data contain NAs, so I need to include na.rm=TRUE in each call. For example:

group <- rep(c('a', 'b'), 3)
value <- c(1:4, NA, NA)
df = data.frame(group, value)

library(dplyr)
group_by(df, group) %>%
  summarise(mean = mean(value, na.rm = TRUE),
            sd = sd(value, na.rm = TRUE),
            min = min(value, na.rm = TRUE))

Is there a way to write the argument na.rm=TRUE only once, rather than on every row?

You should use summarise_at, which lets you apply several functions to the supplied columns and set arguments that are shared among them:

df %>%
  group_by(group) %>%
  summarise_at("value",
               # named list of functions; na.rm = TRUE is passed on to each of them
               list(mean = mean, sd = sd, min = min),
               na.rm = TRUE)
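
In dplyr 1.0.0 and later, summarise_at is superseded by across(). One possible way to keep na.rm = TRUE in a single place there is to wrap each function once, as in the sketch below. The helper name with_na_rm is made up for this illustration and is not part of dplyr.

```r
library(dplyr)

group <- rep(c("a", "b"), 3)
value <- c(1:4, NA, NA)
df <- data.frame(group, value)

# Hypothetical helper (not a dplyr function): returns a version of f
# that is always called with na.rm = TRUE
with_na_rm <- function(f) {
  force(f)
  function(x) f(x, na.rm = TRUE)
}

# na.rm = TRUE is written once, inside with_na_rm
fns <- lapply(list(mean = mean, sd = sd, min = min), with_na_rm)

res <- df %>%
  group_by(group) %>%
  summarise(across(value, fns, .names = "{.fn}"))  # columns: mean, sd, min

res
```

The same fns list can be reused for several columns by changing the first argument of across(), in which case the .names pattern would need to include {.col} to keep the output names unique.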

If you're planning to apply your functions to one column only, you can use filter(!is.na()) on that column to drop its NA values before summarising. Only NAs in this variable are filtered out (i.e. NAs in other variables won't affect the process).

group <- rep(c('a', 'b'), 3)
value <- c(1:4, NA, NA)
df = data.frame(group, value)

library(dplyr)

group_by(df, group) %>% 
  filter(!is.na(value)) %>%
  summarise(mean = mean(value),
            sd = sd(value),
            min = min(value))

# # A tibble: 2 x 4
#    group  mean       sd   min
#   <fctr> <dbl>    <dbl> <dbl>
# 1      a     2 1.414214     1
# 2      b     3 1.414214     2
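
One difference between the two approaches is worth checking on your own data: filtering removes a group entirely when all of its values are NA, whereas na.rm = TRUE keeps the group and returns NaN for the mean. A small sketch with made-up data (df2 and its group "c" are invented for this illustration):

```r
library(dplyr)

# Hypothetical data: group "c" contains only NA
df2 <- data.frame(group = c("a", "a", "b", "c"),
                  value = c(1, NA, 2, NA),
                  stringsAsFactors = FALSE)

# Approach 1: filter out NA rows first; group "c" has no rows left
filtered <- df2 %>%
  group_by(group) %>%
  filter(!is.na(value)) %>%
  summarise(mean = mean(value))

# Approach 2: keep all rows and drop NAs inside the function
kept <- df2 %>%
  group_by(group) %>%
  summarise(mean = mean(value, na.rm = TRUE))

filtered$group  # "a" "b"
kept$group      # "a" "b" "c"
kept$mean       # 1 2 NaN
```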
