Conditional row sum in R

Question

    a   avalue  b   bvalue
1  12   yes     3   no
2  13   yes     3   yes
3  14   no      2   no
4  NA   no      1   no
5  16   NA      1   yes

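I am trying to compute, for each row, the total of the values whose flag is yes (the expected count column, as reproduced in the answers below, is 12, 16, 0, 0, 1).

This is my attempt, which does not work:

df$count <- rowSums(data[data(3) | data(5) == 'yes',c(2,4)], na.rm=TRUE)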

Answer 1

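Edit:

The OP edited the post to include headers in the input data, and from the comments it appears the OP wants the solution to extend to multiple column pairs. Here is a base R solution that should do that: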

raw <- "
   a   avalue  b   bvalue
1  12   yes     3   no
2  13   yes     3   yes
3  14   no      2   no
4  NA   no      1   no
5  16   NA      1   yes "

df <- read.table(text = raw, header = TRUE)

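# Flag columns are the ones whose names end in "value"
use <- endsWith(colnames(df), "value")
# Turn the flags into TRUE/FALSE (an NA flag stays NA for now)
df[use] <- ifelse(df[use] == "yes", TRUE, FALSE)
# Treat every NA, flag or value, as 0 so it drops out of the sum
df[is.na(df)] <- 0
# Multiply each flag column by its value column and sum across each row
rowSums(df[use] * df[!use])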
#>  1  2  3  4  5 
#> 12 16  0  0  1

Created on 2021-02-20 by the reprex package (v0.3.0)
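
To illustrate how the endsWith() idea carries over to more column pairs, here is a minimal sketch with a third pair added. The c and cvalue columns and their contents are made up for illustration and are not part of the OP's data. The sketch also assumes that every flag column name ends in "value" and that value and flag columns appear in the same order.

```r
raw3 <- "
   a   avalue  b   bvalue  c   cvalue
1  12   yes     3   no     5   yes
2  13   yes     3   yes    NA  yes
3  14   no      2   no     7   no
4  NA   no      1   no     2   yes
5  16   NA      1   yes    4   NA "

df3 <- read.table(text = raw3, header = TRUE, stringsAsFactors = FALSE)

# Flag columns are identified by name, so any number of pairs is picked up
is_flag <- endsWith(colnames(df3), "value")

# TRUE only where the flag is present and equal to "yes"
flags <- !is.na(df3[is_flag]) & df3[is_flag] == "yes"

# Value columns as a numeric matrix, with missing values counted as 0
values <- as.matrix(df3[!is_flag])
values[is.na(values)] <- 0

totals <- rowSums(values * flags)
totals
#>  1  2  3  4  5 
#> 17 16  0  2  1
```

Unlike the code above, this variant leaves the data frame itself untouched and works on copies of the two column sets.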

Original post:

Another approach:

raw <- "1  12   yes     3   no
2  13   yes     3   yes
3  14   no      2   no
4  NA   no      1   no
5  16   NA      1   yes"

df <- read.table(text = raw)

suppressPackageStartupMessages({
  library(dplyr)
  library(tidyr)
})

df %>%
  setNames(c("row", "value_first", "use_first", "value_second", "use_second")) %>%
  pivot_longer(!row, names_to = c(".value", "column"), names_sep = "_") %>%
  replace_na(list(value = 0, use = "no")) %>%
  group_by(row) %>%
  summarise(total = sum(value * (use == "yes")))
#> # A tibble: 5 x 2
#>     row total
#> * <int> <dbl>
#> 1     1    12
#> 2     2    16
#> 3     3     0
#> 4     4     0
#> 5     5     1

Created on 2021-02-18 by the reprex package (v0.3.0)

Answer 2

Alternatively, in base R you can simply multiply the cells that satisfy the condition element-wise by the value columns, and then apply rowSums():

raw <- "1  12   yes     3   no
2  13   yes     3   yes
3  14   no      2   no
4  NA   no      1   no
5  16   NA      1   yes"

df <- read.table(text = raw)

rowSums((!is.na(df[,c(3,5)])&df[,c(3,5)]=="yes") * df[,c(2,4)], na.rm=TRUE)
#> [1] 12 16  0  0  1

## Explanation:
# 1) Select relevant rows
(rows_select <- !is.na(df[,c(3,5)])&df[,c(3,5)]=="yes")
#>         V3    V5
#> [1,]  TRUE FALSE
#> [2,]  TRUE  TRUE
#> [3,] FALSE FALSE
#> [4,] FALSE FALSE
#> [5,] FALSE  TRUE

# 2) multiply by the columns with the data:
(rows_sel_val <- rows_select * df[,c(2,4)])
#>   V2 V4
#> 1 12  0
#> 2 13  3
#> 3  0  0
#> 4 NA  0
#> 5  0  1

# 3) Apply rowSums
rowSums(rows_sel_val, na.rm=TRUE)
#> [1] 12 16  0  0  1

Created on 2021-02-18 by the reprex package (v1.0.0)
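
What the !is.na() guard does, and why na.rm = TRUE is still needed next to it, can be seen on a small vector. This is only an illustrative sketch; the vectors flag and value are made up here.

```r
flag  <- c("yes", "no", NA)
value <- c(12, NA, 16)

# Comparing NA with "yes" yields NA rather than FALSE
flag == "yes"
#> [1]  TRUE FALSE    NA

# FALSE & NA is FALSE, so the guard turns an NA flag into FALSE
keep <- !is.na(flag) & flag == "yes"
keep
#> [1]  TRUE FALSE FALSE

# A FALSE flag does not cancel a missing value: FALSE * NA is NA
prod_vec <- keep * value
prod_vec
#> [1] 12 NA  0

# Hence na.rm = TRUE is still required when summing
total <- sum(prod_vec, na.rm = TRUE)
total
#> [1] 12
```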

Answer 3

1) Create a new data frame df0 in which each NA in df is replaced by 0, then use the indicated formula on it. No packages are used.

df0 <- replace(df, is.na(df), 0)
transform(df, count = with(df0, a * (avalue == "yes") + b * (bvalue == "yes")))

giving:

   a avalue b bvalue count
1 12    yes 3     no    12
2 13    yes 3    yes    16
3 14     no 2     no     0
4 NA     no 1     no     0
5 16   <NA> 1    yes     1

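2) Alternatively, if there are more columns than just a and b, this gives the same result but handles any number of columns. ok picks out the a, b, etc. columns and !ok picks out the avalue, bvalue, etc. columns. Note that R automatically recycles ok and !ok to a length equal to the number of columns.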

ok <- c(TRUE, FALSE)
transform(df, count = rowSums(df[ok] * (df[!ok] == "yes"), na.rm = TRUE))
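
The recycling of ok can be checked directly. The sketch below rebuilds the input from the Note and, in addition, uses a hypothetical six-column frame (wide, with made-up columns c and cvalue) to show that the same two-element index keeps alternating. It assumes value and flag columns strictly alternate, starting with a value column.

```r
Lines <- "
    a   avalue  b   bvalue
1  12   yes     3   no
2  13   yes     3   yes
3  14   no      2   no
4  NA   no      1   no
5  16   NA      1   yes"
df <- read.table(text = Lines, header = TRUE)

ok <- c(TRUE, FALSE)

# ok is recycled to TRUE, FALSE, TRUE, FALSE across the four columns
names(df[ok])
#> [1] "a" "b"
names(df[!ok])
#> [1] "avalue" "bvalue"

# Hypothetical third pair: the same index still selects every other column
wide <- cbind(df, c = 1:5, cvalue = "yes")
names(wide[ok])
#> [1] "a" "b" "c"
names(wide[!ok])
#> [1] "avalue" "bvalue" "cvalue"
```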

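2a) Using the collapse package, a variation of (2) is to use num_vars and cat_vars to pick out the numeric and categorical columns.

Note that if any numeric column is entirely NA, it must be set using NA_real_ or NA_integer_ and not just NA, because num_vars extracts columns by type. This can be checked by ensuring that logi_vars(df) has no columns (a plain NA is logical). If any column might be entirely NA, just use (2).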

library(collapse)

transform(df, count = rowSums(num_vars(df0) * (cat_vars(df0) == "yes"), na.rm = TRUE))
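
The point about NA types can be illustrated in base R alone. This sketch does not call collapse; it uses sapply(x, is.numeric) as a stand-in for a type-based column selection, and the data frame x is made up for the purpose.

```r
x <- data.frame(
  a = c(NA, NA),                    # plain NA: a logical column
  b = c(NA_real_, NA_real_),        # double NA: a numeric column
  c = c(NA_integer_, NA_integer_)   # integer NA: an integer column
)

classes <- sapply(x, class)
classes
#>         a         b         c 
#> "logical" "numeric" "integer"

# A selection by type would miss column a even though it holds only NA
num <- sapply(x, is.numeric)
num
#>     a     b     c 
#> FALSE  TRUE  TRUE
```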

Note

The input in reproducible form is:

Lines <- "
    a   avalue  b   bvalue
1  12   yes     3   no
2  13   yes     3   yes
3  14   no      2   no
4  NA   no      1   no
5  16   NA      1   yes"
df <- read.table(text = Lines)

Answer 4

I would tidy my data and compute the sums by group with the tidyverse:

library(tidyverse)
df<-read.table(text = "1  12   yes     3   no
2  13   yes     3   yes
3  14   no      2   no
4  NA   no      1   no
5  16   NA      1   yes")

bind_rows(df[1:3], setNames(df[c(1,4:5)], paste0("V",1:3))) %>%
group_by(V1, V3) %>%
summarise(sum(V2, na.rm = TRUE))

#> Groups:   V1 [5]
#>     V1 V3    `sum(V2, na.rm = TRUE)`
#>  <int> <chr>                   <int>
#>1     1 no                          3
#>2     1 yes                        12
#>3     2 yes                        16
#>4     3 no                         16
#>5     4 no                          1
#>6     5 yes                         1
#>7     5 <NA>                       16
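
The grouped result above reports a total for every combination of row and flag, including no and NA. As a rough base R sketch of the same stack-then-group idea that keeps only the yes totals, one could write the following. The names long, is_yes and totals are introduced here for illustration, and tapply() is swapped in for group_by()/summarise().

```r
df <- read.table(text = "1  12   yes     3   no
2  13   yes     3   yes
3  14   no      2   no
4  NA   no      1   no
5  16   NA      1   yes", stringsAsFactors = FALSE)

# Stack the two (value, flag) pairs on top of each other
long <- data.frame(
  row   = rep(df$V1, times = 2),
  value = c(df$V2, df$V4),
  use   = c(df$V3, df$V5)
)

# TRUE only for flags that are present and equal to "yes"
is_yes <- !is.na(long$use) & long$use == "yes"

# Sum the flagged values per original row; NA values are dropped
totals <- tapply(long$value * is_yes, long$row, sum, na.rm = TRUE)
totals
#>  1  2  3  4  5 
#> 12 16  0  0  1
```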

4 answers

Answer 1
1 2021-02-18 16:25:51

Answer 2
1 2021-02-18 17:08:12

Answer 3
1 2021-02-18 17:08:49

Answer 4
0 2021-02-18 16:17:27
