Conditional mutate with regex in dplyr using rowSums

Question

Suppose I have the following df:

test = read.table(text = "total_score_1 total_score_2 partner_total_score_1 total_score_3 total_score_4 letter
                  1 -1 1 1 -1 B
                  1 1 1 -1 1 C
                  -1 -1 -1 -1 1 A", header = T)

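I want to match all columns whose names contain "total_score" but not the word "partner", and then create a new measure that sums across all of those "total_score" columns, treating -1 as 0.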

I can get the basic rowSum like this:

test %>%
  mutate(net_correct = rowSums(select(., grep("total_score", names(.)))))

Note, however, that this does not exclude columns that also match the word "partner", and I could not work out how to do that in a single grep call.

I now also want to create a total_correct value that is the rowSum over the same columns, but with -1 treated as 0.

This would result in a data.frame like this:

  total_score_1 total_score_2 partner_total_score_1 total_score_3 total_score_4 letter total_sum
1             1            -1                     1             1            -1      B         2
2             1             1                     1            -1             1      C         3
3            -1            -1                    -1            -1             1      A         1

One approach might be to count the number of 1s (rather than actually summing), but I cannot figure out how to do that inside a mutate call.

Answer 1

You can simply modify your regex with a caret so that it captures only the columns whose names start with "total_score":

test %>%
  mutate(net_correct = rowSums(select(., grep("^total_score", names(.)))))
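To see why the caret matters, here is a small base R sketch comparing the two patterns on the column names from the question. The vector name cols and the variables unanchored and anchored are only illustrative.

```r
# Column names copied from the question's data frame
cols <- c("total_score_1", "total_score_2", "partner_total_score_1",
          "total_score_3", "total_score_4", "letter")

# Unanchored pattern: also matches partner_total_score_1
unanchored <- grep("total_score", cols, value = TRUE)

# Anchored pattern: only names that start with "total_score"
anchored <- grep("^total_score", cols, value = TRUE)

print(unanchored)
print(anchored)
```

The anchored version drops partner_total_score_1 because that name starts with "partner", not "total_score".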

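To treat negative values as zero, you can use mutate_all(), here recoding each value to 1 if it is positive and 0 otherwise before summing: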

test %>%
  mutate(total_correct = rowSums(select(., grep("^total_score", names(.))) %>% 
                                 mutate_all(function(x){as.numeric(x>0)})
                              )
  )
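Since the question also asks about counting the 1s directly, the following is a possible sketch of that idea, not a tested replacement for the code above. It assumes the magrittr pipe (so that . refers to test) and uses starts_with() from tidyselect in place of the grep() call; the name result is only illustrative.

```r
library(dplyr)

# Data frame from the question
test <- read.table(text = "total_score_1 total_score_2 partner_total_score_1 total_score_3 total_score_4 letter
1 -1 1 1 -1 B
1 1 1 -1 1 C
-1 -1 -1 -1 1 A", header = TRUE)

# Comparing the selected columns with 1 gives a logical matrix;
# rowSums() then counts the TRUE cells in each row
result <- test %>%
  mutate(total_correct = rowSums(select(., starts_with("total_score")) == 1))

print(result)
```

Because the scores are only ever 1 or -1, counting the 1s gives the same totals as summing with -1 treated as 0.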

Answer 2

You can do it like this:

test %>%
  mutate(net_correct = select(., setdiff(contains("total_score"), contains("partner"))) %>%
           replace(., . == -1, 0) %>%
           rowSums())

#  total_score_1 total_score_2 partner_total_score_1 total_score_3 total_score_4 letter net_correct
#1             1            -1                     1             1            -1      B           2
#2             1             1                     1            -1             1      C           3
#3            -1            -1                    -1            -1             1      A           1
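For readers unfamiliar with the replace() step, here is a minimal base R sketch on a made-up two-column data frame. The names scores, recoded and row_totals are hypothetical and not part of the answer.

```r
# Small made-up example with the same 1 / -1 coding as the question
scores <- data.frame(total_score_1 = c(1, 1, -1),
                     total_score_2 = c(-1, 1, -1))

# scores == -1 is a logical matrix; replace() sets the TRUE cells to 0
recoded <- replace(scores, scores == -1, 0)

# Row sums after recoding, so each -1 contributes nothing
row_totals <- rowSums(recoded)

print(recoded)
print(row_totals)
```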

Answer 3

Another possibility is:

test %>%
 mutate(net_correct = rowSums(select(., contains("total"), -contains("partner")) %>% 
                               replace(., . == -1, 0)))

  total_score_1 total_score_2 partner_total_score_1 total_score_3 total_score_4
1             1            -1                     1             1            -1
2             1             1                     1            -1             1
3            -1            -1                    -1            -1             1
  letter net_correct
1      B           2
2      C           3
3      A           1
