Conditional rowwise sum of subset of columns dplyr
My problem is fairly simple, but I can't find the right solution. I have a data frame like this:
ID name var1 var2 var3
1 a 1 -1 2
2 b 2 3 2
3 c 1 -1 -1
For each row, I need the sum of the values in var1 to var3 that are greater than zero, stored in a var_total variable, like this:
ID name var1 var2 var3 var_total
1 a 1 -1 2 3
2 b 2 3 2 7
3 c 1 -1 -1 1
I managed to get the unconditional sum, like this:
df %>% rowwise %>% mutate(var_total = sum(c_across(starts_with('var'))))
I know there's the na.rm option, so I thought I could temporarily turn the negative values into NAs, but I'm not sure whether that's the right approach or whether there's an easy way to get the original numbers back.
Thanks!
Using c_across and rowwise -
library(dplyr)
df %>%
  rowwise() %>%
  mutate(var_total = {
    x <- c_across(starts_with('var'))
    sum(x[x > 0])
  })
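If the row-by-row evaluation that rowwise() implies turns out to be slow on larger data, the same total can probably be computed inside dplyr without it. This is a minimal sketch, assuming dplyr 1.0.0 or later (for across()) and a rebuilt copy of the example data. pmax(.x, 0) swaps each negative value for zero before rowSums() adds the columns. The name result is only illustrative.

```r
library(dplyr)

# Example data rebuilt from the question (assumed to match the asker's df)
df <- data.frame(
  ID   = 1:3,
  name = c("a", "b", "c"),
  var1 = c(1, 2, 1),
  var2 = c(-1, 3, -1),
  var3 = c(2, 2, -1)
)

# Clamp negatives to zero column by column, then sum across each row
result <- df %>%
  mutate(var_total = rowSums(across(starts_with("var"), ~ pmax(.x, 0))))

result
```

For the three example rows this gives totals of 3, 7 and 1, and the original var columns are left unchanged.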
But a vectorised base R option would be -
cols <- grep('var', names(df))
df$var_total <- rowSums(df[cols] * +(df[cols] > 0))
df
# ID name var1 var2 var3 var_total
#1 1 a 1 -1 2 3
#2 2 b 2 3 2 7
#3 3 c 1 -1 -1 1
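The multiplication by +(df[cols] > 0) works because the unary plus coerces the logical matrix to 0/1, so every non-positive entry is multiplied by zero. An arguably more readable variant uses pmax(). This is a sketch on a rebuilt copy of the example data. The anchored pattern '^var[0-9]+$' is an assumption about the column naming, chosen so that running the code a second time does not pick up var_total itself.

```r
# Example data rebuilt from the question (assumed to match the asker's df)
df <- data.frame(
  ID   = 1:3,
  name = c("a", "b", "c"),
  var1 = c(1, 2, 1),
  var2 = c(-1, 3, -1),
  var3 = c(2, 2, -1)
)

# Anchored pattern (assumed naming scheme): matches var1, var2, ... but not var_total
cols <- grep("^var[0-9]+$", names(df))

# pmax() keeps the matrix shape and replaces every negative entry with 0
df$var_total <- rowSums(pmax(as.matrix(df[cols]), 0))
df
```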
Here is a base R one-liner,
rowSums(replace(df, df < 0, 0)[-c(1, 2)])
#[1] 3 7 1
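As for the idea in the question of turning the negatives into NA and relying on na.rm: that should work as well, and nothing needs restoring afterwards if the replacement is done on a copy of the columns rather than on df itself. A minimal sketch with the example data rebuilt; the helper name vars is only illustrative.

```r
# Example data rebuilt from the question (assumed to match the asker's df)
df <- data.frame(
  ID   = 1:3,
  name = c("a", "b", "c"),
  var1 = c(1, 2, 1),
  var2 = c(-1, 3, -1),
  var3 = c(2, 2, -1)
)

# Work on a copy of the value columns; df keeps its original numbers
vars <- df[c("var1", "var2", "var3")]
vars[vars < 0] <- NA

# na.rm = TRUE skips the masked negatives when summing each row
df$var_total <- rowSums(vars, na.rm = TRUE)
df
```

Note that a row whose values are all negative ends up with a total of 0 here, because summing nothing with na.rm = TRUE gives zero.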