Compute row-wise counts in subsets of columns in dplyr
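Using dplyr, I want to count, row by row, how many times a given text value (or factor level) occurs across a subset of columns.
Here is the input: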
> input_df
  num_col_1 num_col_2 text_col_1 text_col_2
1         1         4        yes        yes
2         2         5         no        yes
3         3         6         no       <NA>
Here is the desired output:
> output_df
  num_col_1 num_col_2 text_col_1 text_col_2 sum_yes
1         1         4        yes        yes       2
2         2         5         no        yes       1
3         3         6         no       <NA>       0
The sum_yes column holds the number of "yes" values in that row.
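For anyone who wants to run the snippets in this thread, here is one way the input could be reconstructed. The column types are an assumption: the printed table does not say whether the text columns are character or factor, so plain character vectors are used here.

```r
# Assumed reconstruction of the input shown above (types are a guess).
input_df <- data.frame(
  num_col_1  = 1:3,
  num_col_2  = 4:6,
  text_col_1 = c("yes", "no", "no"),
  text_col_2 = c("yes", "yes", NA),
  stringsAsFactors = FALSE
)

# The subset of columns in which "yes" should be counted.
text_cols <- c("text_col_1", "text_col_2")
```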
I tried two approaches:
Attempted solution 1:
text_cols = c("text_col_1","text_col_2")
df = input_df %>% mutate(sum_yes = rowSums( select(text_cols) == "yes" ), na.rm = TRUE)
Error:
Error in mutate_impl(.data, dots) :
Evaluation error: no applicable method for 'select_' applied to an object of class "character".
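Two things seem to go wrong in this attempt. Inside mutate, select(text_cols) receives no data frame, so the character vector text_cols is itself treated as the data, which is what the error message complains about. Separately, na.rm = TRUE sits outside the rowSums() call, so it is passed to mutate rather than to rowSums. A minimal sketch of a repaired version, assuming character text columns and reusing a hypothetical reconstruction of the input:

```r
library(dplyr)

# Assumed reconstruction of the question's input (character columns are a guess).
input_df <- data.frame(
  num_col_1  = 1:3,
  num_col_2  = 4:6,
  text_col_1 = c("yes", "no", "no"),
  text_col_2 = c("yes", "yes", NA),
  stringsAsFactors = FALSE
)
text_cols <- c("text_col_1", "text_col_2")

# Pass the data explicitly to select() through the pipe placeholder `.`,
# and keep na.rm = TRUE inside the rowSums() call.
result <- input_df %>%
  mutate(sum_yes = rowSums(select(., all_of(text_cols)) == "yes", na.rm = TRUE))
```

The `.` placeholder only works with the magrittr pipe (%>%), not with the native |> pipe.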
Attempted solution 2:
text_cols = c("text_col_1","text_col_2")
df = input_df %>% select(text_cols) %>% rowsum("yes", na.rm = TRUE)
Error:
Error in rowsum.data.frame(., "yes", na.rm = TRUE) :
incorrect length for 'group'
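The error here most likely comes from mixing up two base R functions with similar names. rowsum() (singular) adds up rows that share a group label, so its second argument must be a grouping vector with one entry per row; "yes" has length 1, hence "incorrect length for 'group'". rowSums() (plural) returns one total per row. A small illustration on a made-up matrix:

```r
# A toy 3 x 2 matrix, unrelated to the question's data.
m <- matrix(1:6, nrow = 3)

# rowsum(): collapses rows that share a group label (rows 1 and 2 are group "a").
by_group <- rowsum(m, group = c("a", "a", "b"))

# rowSums(): one total per row, no grouping involved.
per_row <- rowSums(m)
```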
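Use rowSums inside mutate to count the "yes" values in each row. Here df is the question's input_df and text_cols is defined as in the question; na.rm = TRUE makes the NA in row 3 count as 0 instead of turning the whole row sum into NA.
library(dplyr)
df %>% mutate(sum_yes = rowSums(.[text_cols] == "yes", na.rm = TRUE))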
#  num_col_1 num_col_2 text_col_1 text_col_2 sum_yes
#*     <int>     <int> <fct>      <fct>        <dbl>
#1         1         4 yes        yes              2
#2         2         5 no         yes              1
#3         3         6 no         <NA>             0
Inspired by this answer.
With rowwise and c_across:
df %>%
  rowwise() %>%
  mutate(sum_yes = sum(c_across(all_of(text_cols)) == "yes", na.rm = TRUE))
With do and rowwise (note that this compares every column of the row, not only text_cols):
df %>%
  rowwise() %>%
  do((.) %>% as.data.frame %>%
       mutate(sum_yes = sum(. == "yes", na.rm = TRUE)))
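Or with select and rowSums (the result keeps only the selected text columns plus sum_yes):
df %>%
  select(all_of(text_cols)) %>%
  mutate(sum_yes = rowSums(. == "yes", na.rm = TRUE))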
Or in base R:
df$sum_yes <- rowSums(df[text_cols] == "yes", na.rm = TRUE)
We can also use reduce and map:
library(tidyverse)
df %>%
  select(all_of(text_cols)) %>%
  map(~ .x == "yes" & !is.na(.x)) %>%
  reduce(`+`) %>%
  bind_cols(df, sum_yes = .)
#  num_col_1 num_col_2 text_col_1 text_col_2 sum_yes
#1         1         4        yes        yes       2
#2         2         5         no        yes       1
#3         3         6         no       <NA>       0
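The `& !is.na(.x)` guard in the map step is there because comparing NA with a string yields NA, and a single NA makes a plain sum NA as well. The base R sketch below, using a made-up two-element vector, shows three ways of getting 0 instead; which one reads best is largely a matter of taste.

```r
# A toy row containing one "no" and one missing value.
x <- c("no", NA)

cmp <- x == "yes"                      # FALSE, NA: the comparison keeps the NA

plain_sum   <- sum(cmp)                # NA propagates into the total
na_rm_sum   <- sum(cmp, na.rm = TRUE)  # drop NAs before summing
guarded_sum <- sum(cmp & !is.na(x))    # NA & FALSE is FALSE, so no NA remains
in_sum      <- sum(x %in% "yes")       # %in% never returns NA
```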