Mean excluding zero and NA for all columns with dplyr

I want to compute the mean of every column of my data frame with the dplyr package.

n = c(NA, 3, 5) 
s = c("aa", "bb", "cc") 
b = c(3, 0, 5) 
df = data.frame(n, s, b)

Here I want my function to return mean = 4 for both the n and b columns. I tried mean(df$n[df$n>0]), but that does not scale to a large data frame. I want something like df %>% summarise_each(funs(mean)) ... Thanks

See David's elegant answer:

df %>% summarise_each(funs(mean(.[!is.na(.) & . != 0])), -s) 

Or

df %>% summarise_each(funs(mean(.[. != 0], na.rm = TRUE)), -s)
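In dplyr 1.0.0 and later, summarise_each() and funs() are deprecated in favour of across(). A minimal sketch of the same idea in that style, assuming a recent dplyr; the helper name mean_nonzero is made up for illustration and is not part of dplyr:

```r
library(dplyr)

# Same example data as in the question
df <- data.frame(
  n = c(NA, 3, 5),
  s = c("aa", "bb", "cc"),
  b = c(3, 0, 5)
)

# Hypothetical helper: mean of the values that are neither NA nor 0
mean_nonzero <- function(x) mean(x[!is.na(x) & x != 0])

# Apply the helper to every numeric column; the character column s is skipped
result <- df %>%
  summarise(across(where(is.numeric), mean_nonzero))

result
#   n b
# 1 4 4
```

Selecting with where(is.numeric) replaces the explicit -s, so further non-numeric columns would not need to be listed by hand.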

If you don't want 0s, you probably consider them to be NAs, so let's be explicit about it and then summarize the numeric columns with na.rm = TRUE:

library(dplyr)
df[df==0] <- NA
summarize_if(df, is.numeric, mean, na.rm = TRUE)
#   n b
# 1 4 4

As a one-liner:

summarize_if(`[<-`(df, df==0, value= NA), is.numeric, mean, na.rm = TRUE)

And in base R (the result is a named numeric vector):

sapply(`[<-`(df, df==0, value= NA)[sapply(df, is.numeric)], mean, na.rm=TRUE)
