Calculating rowMean ignoring 0 values
I would like to calculate the row-wise mean of columns x and y in the data below and add it as a new column Mean:
> z
  w x y
1 5 1 1
2 6 2 2
3 7 3 3
4 8 4 0
I am using code like this:
z$mean <- rowMeans(subset(z, select = c(x, y)), na.rm = TRUE)
but I don't know how to ignore the 0 in the last y value. With the 0 ignored, the mean of x and y for that row should be just 4.
Desired output:
> z
  w x y mean
1 5 1 1    1
2 6 2 2    2
3 7 3 3    3
4 8 4 0    4
We can replace the 0s with NA, and then they are ignored when na.rm = TRUE is set:
subz <- z[, c('x', 'y')]
z$Mean <- rowMeans(replace(subz, subz == 0, NA), na.rm = TRUE)
z
#   w x y Mean
# 1 5 1 1    1
# 2 6 2 2    2
# 3 7 3 3    3
# 4 8 4 0    4
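As a small self-contained check of the replace() approach, here is a sketch that assumes the same z as in the question. One caveat is worth knowing: if a row is 0 in every selected column, nothing is left to average and rowMeans(..., na.rm = TRUE) returns NaN for that row. The second data frame, z2, is hypothetical and exists only to illustrate that case.

```r
# Same data as in the question
z <- data.frame(w = 5:8, x = 1:4, y = c(1L, 2L, 3L, 0L))

subz <- z[, c("x", "y")]
# subz == 0 is a logical matrix; replace() sets the matching cells to NA
z$Mean <- rowMeans(replace(subz, subz == 0, NA), na.rm = TRUE)
print(z)

# Hypothetical extra case: the second row is 0 in both columns
z2 <- data.frame(x = c(4L, 0L), y = c(0L, 0L))
m2 <- rowMeans(replace(z2, z2 == 0, NA), na.rm = TRUE)
print(m2)  # the second element is NaN because every value in that row was dropped
```

If NaN is not wanted for such rows, it can be handled afterwards, for example with is.nan().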
Or using dplyr:
library(dplyr)
z %>%
   # replace the 0s with NA in columns x and y
   mutate(across(x:y, ~ na_if(.x, 0))) %>%
   # get the row means of columns x and y
   transmute(Mean = select(., x:y) %>%
                rowMeans(na.rm = TRUE)) %>%
   # bind with the original dataset
   bind_cols(z, .)
Data:
z <- structure(list(w = 5:8, x = 1:4, y = c(1L, 2L, 3L, 0L)),
    class = "data.frame", row.names = c("1", "2", "3", "4"))
Another alternative:
z$Mean <- apply(z[c('x','y')], MARGIN=1, FUN=function(x) mean(x[x!=0]))
apply(., 1, mean) is slightly less efficient than rowMeans, but more flexible, since any function can be applied to each row.
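To illustrate that trade-off, here is a minimal sketch assuming the same z as in the question. It checks that the two approaches agree on this data, then swaps mean() for another summary inside apply(). The use of median() is only an example and does not come from the original answers.

```r
z <- data.frame(w = 5:8, x = 1:4, y = c(1L, 2L, 3L, 0L))
cols <- c("x", "y")

# rowMeans route: turn 0 into NA, then drop the NAs
xy <- z[cols]
via_rowmeans <- rowMeans(replace(xy, xy == 0, NA), na.rm = TRUE)

# apply route: filter out the zeros inside the per-row function
via_apply <- apply(z[cols], MARGIN = 1, FUN = function(r) mean(r[r != 0]))

# Both routes give the same values for this data
print(all.equal(unname(via_rowmeans), unname(via_apply)))

# Flexibility: any summary function can replace mean(), e.g. median()
row_median <- apply(z[cols], 1, function(r) median(r[r != 0]))
print(row_median)
```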