Conditional rowwise sum of subset of columns dplyr

My problem is fairly simple, but I can't find the right solution. I have a data frame like this:

ID   name   var1  var2  var3
1     a       1     -1   2
2     b       2     3    2
3     c       1     -1  -1

I need a new variable, var_total, that holds the row-wise sum of the values in var1 through var3 that are greater than zero, like this:

ID   name   var1  var2  var3 var_total
1     a       1    -1    2      3
2     b       2     3    2      7
3     c       1    -1   -1      1

I managed to get the unconditional sum, like this:

df %>% rowwise() %>% mutate(var_total = sum(c_across(starts_with('var'))))

I know there's the na.rm option, so I thought I could temporarily convert the negative values to NA, but I'm not sure whether that's the right approach or whether there's an easy way to get the original numbers back.

Thanks!

Using c_across and rowwise -

library(dplyr)

df %>%
  rowwise() %>%
  mutate(var_total = {
    # collect this row's var* values, then sum only the positive ones
    x <- c_across(starts_with('var'))
    sum(x[x > 0])
  })
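
To try this on the example data, the data frame can be rebuilt from the table in the question. This is only a minimal sketch: the data.frame() call and the column types are my reconstruction of that table, and the trailing ungroup() is an optional addition that is not part of the answer above.

```r
library(dplyr)

# Reconstructed from the table in the question (column types are assumed)
df <- data.frame(
  ID   = 1:3,
  name = c("a", "b", "c"),
  var1 = c(1, 2, 1),
  var2 = c(-1, 3, -1),
  var3 = c(2, 2, -1)
)

res <- df %>%
  rowwise() %>%
  mutate(var_total = {
    x <- c_across(starts_with("var"))  # this row's var1..var3 as a vector
    sum(x[x > 0])                      # keep only the positive entries
  }) %>%
  ungroup()  # optional: drop the rowwise grouping before further steps

res$var_total
```

On this data var_total should match the values the question asks for (3, 7 and 1).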

But a vectorised base R option would be -

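cols <- grep('var', names(df))
# (df[cols] > 0) is a logical mask; unary + turns it into 0/1, so non-positive values are multiplied by 0
df$var_total <- rowSums(df[cols] * +(df[cols] > 0))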
df
#  ID name var1 var2 var3 var_total
#1  1    a    1   -1    2         3
#2  2    b    2    3    2         7
#3  3    c    1   -1   -1         1
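
The NA idea from the question can also work, and there is nothing to restore afterwards if the replacement is done on a temporary copy rather than on df itself. A rough sketch of that approach follows; the data frame is again my reconstruction of the question's table, and tmp is just a hypothetical helper name.

```r
# Reconstructed from the table in the question (column types are assumed)
df <- data.frame(
  ID   = 1:3,
  name = c("a", "b", "c"),
  var1 = c(1, 2, 1),
  var2 = c(-1, 3, -1),
  var3 = c(2, 2, -1)
)

cols <- grep("^var", names(df))  # positions of var1..var3

tmp <- df[cols]                  # temporary copy; df itself is not modified
tmp[tmp <= 0] <- NA              # blank out everything that is not positive
df$var_total <- rowSums(tmp, na.rm = TRUE)

df
```

One thing to watch: once var_total exists, a pattern like 'var' will match it too, so compute cols before adding the new column, or use a stricter pattern such as "^var[0-9]+$".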

Here is a base R one-liner:

rowSums(replace(df, df < 0, 0)[-c(1, 2)])
#[1] 3 7 1
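
The same clamp-to-zero idea can also be written in dplyr without rowwise(), which may matter on larger data because the work stays vectorised. This is a sketch and not part of the answers above: it uses across() with pmax() to floor each var column at 0 and then rowSums() to add them up, on a data frame reconstructed from the question's table.

```r
library(dplyr)

# Reconstructed from the table in the question (column types are assumed)
df <- data.frame(
  ID   = 1:3,
  name = c("a", "b", "c"),
  var1 = c(1, 2, 1),
  var2 = c(-1, 3, -1),
  var3 = c(2, 2, -1)
)

res <- df %>%
  # pmax(.x, 0) replaces negatives with 0 column by column;
  # rowSums() then adds the clamped columns row by row
  mutate(var_total = rowSums(across(starts_with("var"), ~ pmax(.x, 0))))

res
```

The original var1 to var3 columns are left untouched, because the clamped values only exist inside the rowSums() call.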
