Calculating rowMean ignoring 0 values

I would like to calculate the row-wise mean of columns x and y, as below, and add it as a new column mean:

> z
  w x  y 
1 5 1  1    
2 6 2  2    
3 7 3  3    
4 8 4  0    

I am using the code like:

z$mean <- rowMeans(subset(z, select = c(x, y)), na.rm = TRUE)

but I don't know how to ignore the 0 in the last y value, so that the mean of x and y for that row is just 4.

Output desired:

> z
  w x  y mean
1 5 1  1    1
2 6 2  2    2
3 7 3  3    3
4 8 4  0    4

We can replace the 0s with NA; rowMeans with na.rm = TRUE then ignores them:

subz <- z[, c('x', 'y')]
z$Mean <- rowMeans(replace(subz, subz == 0, NA), na.rm = TRUE)
z
#  w x y Mean
#1 5 1 1    1
#2 6 2 2    2
#3 7 3 3    3
#4 8 4 0    4
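
To see why this works, it can help to look at the intermediate step. The following is a minimal sketch, not part of the original answer: it rebuilds the example data so that it runs on its own, shows what replace() returns, and then checks the result against an NA-free formulation (row sum divided by the count of nonzero entries). The names subz_na and alt are introduced here for illustration only.

```r
# Rebuild the example data so the snippet is self-contained
z <- data.frame(w = 5:8, x = 1:4, y = c(1L, 2L, 3L, 0L))
subz <- z[, c("x", "y")]

# Intermediate step: zeros become NA, all other values are untouched
subz_na <- replace(subz, subz == 0, NA)
subz_na

# rowMeans then averages only the non-NA values in each row
rowMeans(subz_na, na.rm = TRUE)

# Equivalent formulation without NA: zeros add nothing to the row sum,
# so divide the row sum by the number of nonzero entries in that row
alt <- rowSums(subz) / rowSums(subz != 0)
alt
```

Both versions return NaN for a row in which every selected value is 0, since there is nothing left to average.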

Or using dplyr

library(dplyr)
z %>%
  # replace the 0s with NA in columns x and y
  # (a lambda is used because passing extra arguments through across() is deprecated in recent dplyr)
  mutate(across(x:y, ~ na_if(.x, 0))) %>%
  # get the row means of columns x and y, ignoring the NAs
  transmute(Mean = select(., x:y) %>%
                    rowMeans(na.rm = TRUE)) %>%
  # bind with the original dataset
  bind_cols(z, .)

data

z <- structure(list(w = 5:8, x = 1:4, y = c(1L, 2L, 3L, 0L)), 
  class = "data.frame", row.names = c("1", 
"2", "3", "4"))

Another alternative:

z$Mean <- apply(z[c('x','y')], MARGIN=1, FUN=function(x) mean(x[x!=0]))

apply(., 1, mean) calls the function once per row, so it is generally slower than the vectorized rowMeans, but it is more flexible because it accepts any function.
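
As a quick check that the two approaches agree, here is a small sketch on made-up data (z2 is a hypothetical data frame, not the one from the question) that includes a row consisting only of zeros. In that case there is nothing left to average, and both approaches return NaN.

```r
# Hypothetical data: the third row contains only zeros
z2 <- data.frame(x = c(1, 4, 0), y = c(1, 0, 0))

# apply(): drop the zeros from each row, then take the mean
via_apply <- apply(z2, 1, function(v) mean(v[v != 0]))

# rowMeans(): turn the zeros into NA, then ignore the NAs
via_rowmeans <- rowMeans(replace(z2, z2 == 0, NA), na.rm = TRUE)

via_apply
via_rowmeans
```

If NaN is not wanted for such rows, the result can be post-processed, for example with is.nan(), depending on what value makes sense for the data.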
