Conditional row sum over a subset of columns with dplyr

Question

My problem is fairly simple, but I can't find the right solution. I have a data frame like this:

ID   name   var1  var2  var3
1     a       1     -1   2
2     b       2     3    2
3     c       1     -1  -1

I need a var_total variable that holds, for each row, the sum of the values in var1 to var3 that are greater than zero, like this:

ID   name   var1  var2  var3 var_total
1     a       1    -1    2      3
2     b       2     3    2      7
3     c       1    -1   -1      1

I managed to get the unconditional sum like this:

 df %>% rowwise %>%  mutate(var_total = sum(c_across(starts_with('var'))))

I know there's the na.rm option, so I thought I could temporarily turn the negative values into NAs, but I'm not sure whether that's the right approach, or whether there's an easy way to get the original numbers back afterwards.

Thanks!

Answer 1

Using c_across and rowwise:

library(dplyr)

df %>%
  rowwise() %>%
  mutate(var_total = {
    x <- c_across(starts_with('var'))
    sum(x[x > 0])
    })

But a vectorised base R option would be:

cols <- grep('var', names(df))
df$var_total <- rowSums(df[cols] * +(df[cols] > 0))
df
#  ID name var1 var2 var3 var_total
#1  1    a    1   -1    2         3
#2  2    b    2    3    2         7
#3  3    c    1   -1   -1         1
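
If you would rather stay in dplyr but avoid rowwise(), one possible sketch is to clamp each column at zero with pmax() inside across() and then call rowSums() on the result. The data frame below is rebuilt from the question so the snippet runs on its own, and df_total is just an illustrative name, not something from the original post.

```r
library(dplyr)

# Example data copied from the question
df <- data.frame(
  ID   = 1:3,
  name = c("a", "b", "c"),
  var1 = c(1, 2, 1),
  var2 = c(-1, 3, -1),
  var3 = c(2, 2, -1)
)

# pmax(.x, 0) replaces negative values with 0, column by column;
# rowSums() then adds the clamped columns across each row
df_total <- df %>%
  mutate(var_total = rowSums(across(starts_with("var"), ~ pmax(.x, 0))))

# var_total holds 3, 7 and 1 for the three rows
df_total
```

Because the clamping happens only inside across(), the original var1 to var3 columns are left untouched, so there is nothing to convert back afterwards.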

Answer 2

Here is a base R one-liner:

rowSums(replace(df, df < 0, 0)[-c(1, 2)])
#[1] 3 7 1
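
Note that the one-liner compares the whole data frame, including ID and name, against zero and then drops the first two columns by position. A slightly more defensive sketch, assuming the columns of interest are always named var followed by digits, selects them by name first. The names vars and var_total are only illustrative.

```r
# Example data copied from the question
df <- data.frame(
  ID   = 1:3,
  name = c("a", "b", "c"),
  var1 = c(1, 2, 1),
  var2 = c(-1, 3, -1),
  var3 = c(2, 2, -1)
)

# Keep only var1, var2, ... so ID and name never enter the comparison
# (the pattern also skips a var_total column if one already exists)
vars <- df[grepl("^var[0-9]+$", names(df))]

# Set negative entries to 0 in a copy, then sum each row
var_total <- rowSums(replace(vars, vars < 0, 0))

# The sums are 3, 7 and 1; df itself is not modified
var_total
```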