How can I group by ID and summarise by mean with na.rm = TRUE in R?

I want to group by ID and summarise by the mean while removing NAs. Please see the example code below.

# Example data
library(dplyr)  # provides tibble(), %>%, group_by() and the summarise verbs
ID <- c(1, 1, 1, 2, 2, 3, 3)
x <- c(2, 3, NA, 2, 3, 1, 1)
ID_x <- tibble(ID, x)

# 1. Works
ID_x %>%
  group_by(ID) %>% 
  summarise_each(mean)

# 2. Does not work with na.rm=TRUE
ID_x %>%
  group_by(ID) %>% 
  summarise_each(mean(., na.rm=TRUE))

Thanks in advance

Use a lambda ( ~ ):

library(dplyr)
ID_x %>%
  group_by(ID) %>% 
  summarise_each(~ mean(., na.rm=TRUE))

Output:

# A tibble: 3 × 2
     ID     x
  <dbl> <dbl>
1     1   2.5
2     2   2.5
3     3   1  
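
For context, the ~ formula is shorthand for an anonymous function whose argument is referred to as the dot. The sketch below illustrates that equivalence outside of a pipeline. It assumes the rlang package is installed (it comes with dplyr), and the names mean_no_na and x_id1 are made up for this illustration.

```r
library(rlang)

# Convert the formula into an ordinary function; `.` stands for its first argument.
mean_no_na <- as_function(~ mean(., na.rm = TRUE))

# Values of x for ID 1 from the example data.
x_id1 <- c(2, 3, NA)

mean(x_id1)        # NA, because one value is missing
mean_no_na(x_id1)  # 2.5, the NA is dropped before averaging
```

This is also why the first attempt in the question returns NA for ID 1: a bare mean is applied without na.rm = TRUE.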

Also, in recent versions of dplyr, summarise_each() emits a warning because it is deprecated in favor of across():

ID_x %>%
  group_by(ID) %>% 
  summarise(across(everything(), ~ mean(., na.rm=TRUE)))
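
When only one column needs summarising, naming it directly inside summarise() is a common alternative to across(). The following is a minimal sketch rather than part of the original answer; it assumes dplyr 1.0.0 or later, rebuilds the example data so it runs on its own, and uses the illustrative names result, result_all and mean_x.

```r
library(dplyr)

# Example data from the question, rebuilt so the block is self-contained.
ID_x <- tibble(
  ID = c(1, 1, 1, 2, 2, 3, 3),
  x  = c(2, 3, NA, 2, 3, 1, 1)
)

# Summarise a single named column; na.rm = TRUE is passed straight to mean().
result <- ID_x %>%
  group_by(ID) %>%
  summarise(mean_x = mean(x, na.rm = TRUE))

# Same idea for several columns: restrict across() to numeric columns.
# Grouping columns (here ID) are excluded from the selection automatically.
result_all <- ID_x %>%
  group_by(ID) %>%
  summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)))
```

The first form keeps the intent explicit for a single column, while the across() form scales to data frames with many numeric columns.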

A different option is to use funs(). Note that funs() is itself deprecated (since dplyr 0.8.0), so this also produces a warning in recent versions:

ID_x %>%
  group_by(ID) %>% 
  summarise_each(funs(mean(., na.rm = TRUE)))

Output:

# A tibble: 3 × 2
     ID     x
  <dbl> <dbl>
1     1   2.5
2     2   2.5
3     3   1  
