I have 4 columns in a dataframe of 244 columns. I need to do a sum over these columns, which can be done with a simple sum function. However, the sum is not taking into consideration the nas. So when I run:
df <- d%>%
rowwise() %>%
mutate(DV = sum(x1, x2, x3, x4, na.rm=TRUE))
I am getting 0, when all the values are NA, I would like to get NA when all the values in the x1[2], x2[2], x3[2] and x4[2] are zero. I have been scratching my head for 3 hours on this, and no joy. I did create (gleaned) a function from the internet, but it is still not working:
sum0 <- function(x, ...){if(sum(is.na(x))==4) return(NA_real_) else(sum(x, ..., na.rm=TRUE))}
df <- d%>%
rowwise() %>%
mutate(DV = sum0(x1, x2, x3, x4, na.rm=TRUE))
It is not working because the is.na value is not counting the values correctly. I am really stuck here, any help would be very appreciated.
Hypothetical data
# Create a, b, c, d variables
a <- c('a1', 'a2', 'a3', 'a4')
b <- c(10, NA, 30, 40)
c <- c(2.5, NA, 8, 1)
d <- c(2.5, NA, 10, 7)
e <- c(2.5, NA, 10, 7)
# Join the variables to create a data frame
df <- data.frame(a, b, c, d, e)
dfx <- df %>% rowwise() %>% mutate(DV = sum0(c(b,c,d,e)), na.rm = TRUE)
Here I would want the value in DV[2] to be NA, the rest summed normally.
You can do:
df %>%
mutate(DV = rowSums(select(., b:e)))
a b c d e DV
1 a1 10 2.5 2.5 2.5 17.5
2 a2 NA NA NA NA NA
3 a3 30 8.0 10.0 10.0 58.0
4 a4 40 1.0 7.0 7.0 55.0
If there could be rows with just a few NAs:
df %>%
mutate(DV = rowSums(select(., b:e), na.rm = TRUE) * NA ^ (rowSums(!is.na(select(., b:e))) == 0))
Alternatively, you can also do:
df %>%
filter_at(vars(b:e), any_vars(!is.na(.))) %>%
mutate(DV = rowSums(select(., b:e), na.rm = TRUE)) %>%
bind_rows(df %>%
filter_at(vars(b:e), all_vars(is.na(.))))
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.