Sum values in rows while ignoring certain values in R

Question

I have a follow-up on this question: Sum values from rows with conditions in R

Here is my data:

ID <- c("A","B","C","D","E","F")
Q1 <- c(0,1,7,9,NA,3)
Q2 <- c(0,3,2,2,NA,3)
Q3 <- c(0,0,7,9,NA,3)

dta <- data.frame(ID,Q1,Q2,Q3)

I need to sum the values below 7 in each row. In rows that contain values over 7, I need to sum only the numbers below 7 and ignore the others. Rows where every value is NA should be preserved. The result should look like this:

I have tried this code based on the response from the last post:

dta2  <- dta %>% rowwise() %>% mutate(ProxySum = ifelse(all(c_across(Q1:Q3) < 7), Reduce(`+`, c_across(Q1:Q3)), (ifelse(any(c_across(Q1:Q3) > 7), sum(.[. <  7]), NA))))

But in the rows with numbers over 7 I end up with a sum of all the rows and columns. What am I missing?
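One possible explanation, offered as an assumption rather than a confirmed diagnosis: inside a magrittr pipeline, the dot stands for the whole object on the left-hand side of the pipe, not the current row, so sum(.[. < 7]) would add up every qualifying value in the data frame. A minimal sketch that keeps the rowwise() idea but works only on the current row's values follows; the helper sum_below is a hypothetical name, not something from the original post.

```r
library(dplyr)

dta <- data.frame(
  ID = LETTERS[1:6],
  Q1 = c(0, 1, 7, 9, NA, 3),
  Q2 = c(0, 3, 2, 2, NA, 3),
  Q3 = c(0, 0, 7, 9, NA, 3)
)

# Hypothetical helper: sum the values below a cutoff,
# returning NA when every value in the row is NA.
sum_below <- function(x, cutoff = 7) {
  if (all(is.na(x))) return(NA_real_)
  sum(x[x < cutoff], na.rm = TRUE)
}

res <- dta %>%
  rowwise() %>%                                      # evaluate mutate() one row at a time
  mutate(ProxySum = sum_below(c_across(Q1:Q3))) %>%  # c_across() gives this row's values only
  ungroup()
```

Here c_across(Q1:Q3) supplies only the current row's values, so the dot placeholder is not needed at all.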

Answer 1

Another option making use of rowSums and dplyr::across:

ID <- LETTERS[1:6]
Q1 <- c(0,1,7,9,NA,3) 
Q2 <- c(0,3,2,2,NA,3) 
Q3 <- c(0,0,7,9,NA,3) 

dta <- data.frame(ID,Q1,Q2,Q3) 

library(dplyr)

dta %>% 
  mutate(ProxySum = rowSums(across(Q1:Q3, function(.x) { .x[.x >= 7] <- 0; .x })))
#>   ID Q1 Q2 Q3 ProxySum
#> 1  A  0  0  0        0
#> 2  B  1  3  0        4
#> 3  C  7  2  7        2
#> 4  D  9  2  9        2
#> 5  E NA NA NA       NA
#> 6  F  3  3  3        9

Answer 2

How about a slightly different approach: first pivot longer, then sum by condition within each group, then pivot back.

library(tidyverse)

ID <- c("A","B","C","D","E","F")
Q1 <- c(0,1,7,9,NA,3) 
Q2 <- c(0,3,2,2,NA,3) 
Q3 <- c(0,0,7,9,NA,3) 

dta <- data.frame(ID,Q1,Q2,Q3) 

dta %>%
  pivot_longer(-ID) %>%
  group_by(ID) %>%
  mutate(ProxySum = sum(value[which(value<7)])) %>%
  pivot_wider()
#> # A tibble: 6 × 5
#> # Groups:   ID [6]
#>   ID    ProxySum    Q1    Q2    Q3
#>   <chr>    <dbl> <dbl> <dbl> <dbl>
#> 1 A            0     0     0     0
#> 2 B            4     1     3     0
#> 3 C            2     7     2     7
#> 4 D            2     9     2     9
#> 5 E            0    NA    NA    NA
#> 6 F            9     3     3     3

Created on 2021-12-14 by the reprex package (v2.0.1)
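Note that this approach returns 0, not NA, for row E: which() drops the NA comparisons, and sum() of an empty vector is 0. If all-NA rows should stay NA, as the question asks, one possible variant adds an explicit check. The if/else guard below is an addition for illustration, not part of the original answer.

```r
library(dplyr)
library(tidyr)

dta <- data.frame(
  ID = c("A", "B", "C", "D", "E", "F"),
  Q1 = c(0, 1, 7, 9, NA, 3),
  Q2 = c(0, 3, 2, 2, NA, 3),
  Q3 = c(0, 0, 7, 9, NA, 3)
)

res <- dta %>%
  pivot_longer(-ID) %>%                 # one row per ID/question pair
  group_by(ID) %>%
  mutate(
    # Keep NA when the whole group is missing; otherwise sum the values below 7
    ProxySum = if (all(is.na(value))) NA_real_ else sum(value[which(value < 7)])
  ) %>%
  ungroup() %>%
  pivot_wider(names_from = name, values_from = value)  # back to one row per ID
```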

Answer 3

One way to do it in base R:

rowSums(dta[, 2:4] * (dta[, 2:4] < 7))

# [1]  0  4  2  2 NA  9
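To unpack the one-liner: the comparison yields a logical matrix, multiplying by it turns every value of 7 or more into 0 (TRUE counts as 1, FALSE as 0), and NA entries stay NA, so rowSums() returns NA for row E. A small self-contained sketch with the intermediate steps spelled out; the names q, keep and masked are illustrative only.

```r
dta <- data.frame(
  ID = c("A", "B", "C", "D", "E", "F"),
  Q1 = c(0, 1, 7, 9, NA, 3),
  Q2 = c(0, 3, 2, 2, NA, 3),
  Q3 = c(0, 0, 7, 9, NA, 3)
)

q <- dta[, 2:4]      # the three Q columns
keep <- q < 7        # logical matrix: TRUE below 7, FALSE otherwise, NA where missing
masked <- q * keep   # values of 7 or more become 0; NA stays NA

# na.rm is left at its default (FALSE), so NA propagates to the row total
dta$ProxySum <- rowSums(masked)
```

Because na.rm is left at FALSE, a row with even a single NA would also come out as NA, not only rows that are entirely NA.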

Answer 4

Here is another dplyr solution:

library(dplyr)
dta %>% 
  mutate(across(where(is.numeric), ~ifelse(.>=7,0,.)),
         sum = rowSums(across(where(is.numeric))))

  ID Q1 Q2 Q3 sum
1  A  0  0  0   0
2  B  1  3  0   4
3  C  0  2  0   2
4  D  0  2  0   2
5  E NA NA NA  NA
6  F  3  3  3   9

4 answers

Answer 1: 1, 2021-12-14 19:30:40
Answer 2: 0, 2021-12-14 19:26:19
Answer 3: 0, 2021-12-14 19:48:59
Answer 4: 0, 2021-12-14 21:37:31
