
Conditional rowwise sum of subset of columns dplyr

My problem is fairly simple, but I can't find the right solution. I have a data frame like this:

ID  name  var1  var2  var3
1   a        1    -1     2
2   b        2     3     2
3   c        1    -1    -1

I need a var_total variable that holds, for each row, the sum of the values from var1 to var3 that are greater than zero, like this:

ID  name  var1  var2  var3  var_total
1   a        1    -1     2          3
2   b        2     3     2          7
3   c        1    -1    -1          1

I managed to get the unconditional sum like this:

df %>% rowwise() %>% mutate(var_total = sum(c_across(starts_with('var'))))

I know there's the na.rm option, so I thought I could temporarily turn the negative values into NAs, but I'm not sure that's the right approach, or whether there's an easy way to get the original numbers back.

Thanks!

Using c_across and rowwise -

library(dplyr)

df %>%
  rowwise() %>%
  mutate(var_total = {
    # collect this row's var* values, then sum only the positive ones
    x <- c_across(starts_with('var'))
    sum(x[x > 0])
  })

But a vectorised base R option would be -

cols <- grep('var', names(df))
# +(...) turns the logical matrix into 0/1, so non-positive values are zeroed out before summing
df$var_total <- rowSums(df[cols] * +(df[cols] > 0))
df
#  ID name var1 var2 var3 var_total
#1  1    a    1   -1    2         3
#2  2    b    2    3    2         7
#3  3    c    1   -1   -1         1
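
The NA idea from the question can also work without touching the original columns, since across() returns a modified copy that only rowSums() sees. The following is a minimal sketch under that assumption; the sample df is rebuilt here from the table in the question, and out is just an illustrative name.

```r
library(dplyr)

# Sample data rebuilt from the question (assumed to be numeric columns)
df <- data.frame(
  ID   = 1:3,
  name = c("a", "b", "c"),
  var1 = c(1, 2, 1),
  var2 = c(-1, 3, -1),
  var3 = c(2, 2, -1)
)

out <- df %>%
  mutate(var_total = rowSums(
    # negatives become NA only inside this temporary copy of the var* columns
    across(starts_with("var"), ~ replace(.x, .x < 0, NA)),
    na.rm = TRUE
  ))

# var1, var2 and var3 in out keep their original values;
# var_total is 3, 7, 1 for the sample data
```

Because the replacement happens inside across(), there is nothing to convert back afterwards, and the whole operation stays vectorised rather than rowwise.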

Here is a base R one-liner:

rowSums(replace(df, df < 0, 0)[-c(1, 2)])
#[1] 3 7 1
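
Note that the one-liner runs the comparison over every column, including ID and name, and then drops the first two columns by position. A slightly more defensive sketch, assuming the columns of interest all start with "var", selects them by name first; vars and totals are illustrative names, and the sample df is rebuilt from the question.

```r
# Sample data rebuilt from the question
df <- data.frame(
  ID   = 1:3,
  name = c("a", "b", "c"),
  var1 = c(1, 2, 1),
  var2 = c(-1, 3, -1),
  var3 = c(2, 2, -1)
)

# Keep only the columns whose names start with "var"
vars <- df[startsWith(names(df), "var")]

# Zero out the negatives in that copy, then sum across each row
totals <- rowSums(replace(vars, vars < 0, 0))

# totals holds 3, 7, 1 for the sample data; df itself is unchanged
```

Selecting by name keeps the result the same if the column order changes or if further non-numeric columns are added.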
