Mean excluding zero and NA for all columns with dplyr
I want to compute the mean of every column of my data frame with the dplyr package.
n = c(NA, 3, 5)
s = c("aa", "bb", "cc")
b = c(3, 0, 5)
df = data.frame(n, s, b)
Here I want my function to return mean = 4 for the n and b columns. I tried
mean(df$n[df$n>0])
but that is not practical for a large data frame. I want something like
df %>% summarise_each(funs(mean))
... Thanks
Cf. David's elegant answer:
df %>% summarise_each(funs(mean(.[!is.na(.) & . != 0])), -s)
Or
df %>% summarise_each(funs(mean(.[. != 0], na.rm = TRUE)), -s)
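summarise_each() and funs() are deprecated in recent dplyr releases. A minimal sketch of the same idea with across(), assuming dplyr 1.0.0 or later is installed (the name res is only for illustration and is not part of the original answer):

```r
library(dplyr)

# Example data from the question
df <- data.frame(n = c(NA, 3, 5), s = c("aa", "bb", "cc"), b = c(3, 0, 5))

# For every numeric column: drop the zeros, then average while ignoring NA
res <- df %>%
  summarise(across(where(is.numeric), ~ mean(.x[.x != 0], na.rm = TRUE)))

res
```

Selecting with where(is.numeric) replaces the explicit -s, so other non-numeric columns are skipped as well.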
If you don't want the 0s, you probably consider them to be NAs, so let's make that explicit and then summarize the numeric columns with na.rm = TRUE:
library(dplyr)
df[df==0] <- NA
summarize_if(df, is.numeric, mean, na.rm = TRUE)
# n b
# 1 4 4
As a one-liner:
summarize_if(`[<-`(df, df==0, value= NA), is.numeric, mean, na.rm = TRUE)
And in base R (result as a named numeric vector):
sapply(`[<-`(df, df==0, value= NA)[sapply(df, is.numeric)], mean, na.rm=TRUE)
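If the `[<-` call reads as too terse, the same base R computation can be sketched with a small helper that leaves df untouched. The names mean_nonzero, num_cols and res are illustrative and not part of the original answer:

```r
# Example data from the question
df <- data.frame(n = c(NA, 3, 5), s = c("aa", "bb", "cc"), b = c(3, 0, 5))

# Hypothetical helper: mean of a numeric vector, skipping NA and 0
mean_nonzero <- function(x) mean(x[!is.na(x) & x != 0])

# Logical mask of the numeric columns
num_cols <- vapply(df, is.numeric, logical(1))

# Apply the helper column by column; the result is a named numeric vector
res <- vapply(df[num_cols], mean_nonzero, numeric(1))

res
```

vapply() is used instead of sapply() so that the return type is checked. On the example data both n and b should come out as 4.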