R: Compute logical OR over subset of columns with matches
I have this data frame:
df <- tibble(id = c(1, 2, 3), c_1 = c(T, T, F), c_2 = c(F, F, T)) %>% group_by(id)
# A tibble: 3 x 3
id c_1 c_2
<dbl> <lgl> <lgl>
1 1 TRUE FALSE
2 2 TRUE FALSE
3 3 FALSE TRUE
I now want to compute the row-wise logical OR of the columns whose names start with c_. I tried:
df %>% mutate(valid = sum(select(matches("^c_")) == 0))
but I get:
`matches()` must be used within a *selecting* function.
How can I fix this?
library(dplyr)
df <- tibble(id = c(1, 2, 3), c_1 = c(T, T, F), c_2 = c(F, F, T))
df %>%
rowwise() %>%
mutate(
valid = any(c_across(starts_with("c_")))
) %>%
ungroup()
#> # A tibble: 3 × 4
#> id c_1 c_2 valid
#> <dbl> <lgl> <lgl> <lgl>
#> 1 1 TRUE FALSE TRUE
#> 2 2 TRUE FALSE TRUE
#> 3 3 FALSE TRUE TRUE
Created on 2022-07-11 by the reprex package (v2.0.1)
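The error in the question most likely arises because select(matches("^c_")) passes matches() as the data argument of select(), so it is evaluated outside any selection context. Inside c_across() the same regular expression is accepted. A minimal sketch, assuming a hypothetical df2 that adds a fourth row with no TRUE value so that the OR is visibly FALSE somewhere:

```r
library(dplyr)

# Hypothetical example data: the question's columns plus a row with no TRUE
df2 <- tibble(
  id  = c(1, 2, 3, 4),
  c_1 = c(TRUE, TRUE, FALSE, FALSE),
  c_2 = c(FALSE, FALSE, TRUE, FALSE)
)

res_rowwise <- df2 %>%
  rowwise() %>%                                      # evaluate mutate() one row at a time
  mutate(valid = any(c_across(matches("^c_")))) %>%  # matches() is allowed inside c_across()
  ungroup()                                          # drop the rowwise grouping again

res_rowwise$valid
```

The ungroup() at the end only removes the rowwise structure; later verbs then operate on the whole table again.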
Update: why tibble() is needed:
Without as_tibble(), tibble() or data.frame() it will not work, because your data frame is a grouped_df:
> class(df)
[1] "grouped_df" "tbl_df" "tbl" "data.frame"
Without as_tibble(), tibble() or data.frame() -> it will not work:
> df %>%
mutate(valid = ifelse(rowSums(select(., contains("c_")))==1, TRUE, FALSE))
Adding missing grouping variables: `id`
Error in `mutate()`:
! Problem while computing `valid = ifelse(rowSums(select(.,
contains("c_"))) == 1, TRUE, FALSE)`.
x `valid` must be size 1, not 3.
i The error occurred in group 1: id = 1.
With as_tibble(), tibble() or data.frame() -> it will work:
df %>%
data.frame() %>%
mutate(valid = ifelse(rowSums(select(., contains("c_")))==1, TRUE, FALSE))
#or
df %>%
tibble() %>%
mutate(valid = ifelse(rowSums(select(., contains("c_")))==1, TRUE, FALSE))
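Since the failure comes from the grouping (mutate() runs once per group, while rowSums() on the whole table returns three values), dropping the groups with ungroup() should work as well. This is a sketch of that alternative, not part of the original answer. It also compares with > 0 rather than == 1, on the assumption that a row with several TRUE values should still count as valid:

```r
library(dplyr)

df <- tibble(id = c(1, 2, 3), c_1 = c(TRUE, TRUE, FALSE), c_2 = c(FALSE, FALSE, TRUE)) %>%
  group_by(id)

res_ungrouped <- df %>%
  ungroup() %>%                                               # remove the grouping by id
  mutate(valid = rowSums(select(., starts_with("c_"))) > 0)   # at least one TRUE per row

res_ungrouped
```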
First answer: if we want to use select, here is a straightforward way:
library(tibble)
library(dplyr)
df %>%
as_tibble() %>%
mutate(valid = ifelse(rowSums(.[2:3])==1, TRUE, FALSE))
or
library(tibble)
library(dplyr)
df %>%
as_tibble() %>%
mutate(valid = ifelse(rowSums(select(., contains("c_")))==1, TRUE, FALSE))
# A tibble: 3 x 4
id c_1 c_2 valid
<dbl> <lgl> <lgl> <lgl>
1 1 TRUE FALSE TRUE
2 2 TRUE FALSE TRUE
3 3 FALSE TRUE TRUE
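Note that rowSums(...) == 1 is TRUE only when exactly one column is TRUE, which matches a logical OR for the data above but not in general. The sketch below uses a hypothetical df3 with one row containing two TRUE values and one row containing none to show the difference. The ifelse(..., TRUE, FALSE) wrapper is dropped because the comparison already returns a logical vector:

```r
library(dplyr)

# Hypothetical data: row 2 has two TRUE values, row 4 has none
df3 <- tibble(
  id  = c(1, 2, 3, 4),
  c_1 = c(TRUE, TRUE, FALSE, FALSE),
  c_2 = c(FALSE, TRUE, TRUE, FALSE)
)

res_or <- df3 %>%
  mutate(
    exactly_one = rowSums(select(., contains("c_"))) == 1,  # TRUE only for a single TRUE
    valid       = rowSums(select(., contains("c_"))) > 0    # logical OR across the columns
  )

res_or
```

Row 2 is the one where the two columns disagree: exactly_one is FALSE there, while valid is TRUE.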
We can use if_any directly instead of rowwise:
library(dplyr)
df %>%
mutate(valid = if_any(starts_with('c_')))
# A tibble: 3 × 4
id c_1 c_2 valid
<dbl> <lgl> <lgl> <lgl>
1 1 TRUE FALSE TRUE
2 2 TRUE FALSE TRUE
3 3 FALSE TRUE TRUE
df <- tibble(id = c(1, 2, 3), c_1 = c(TRUE, TRUE, FALSE),
c_2 = c(FALSE, FALSE, TRUE))
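If the goal is only to keep the rows where at least one c_ column is TRUE, if_any can also be used inside filter(), without creating the valid column first. A small sketch, assuming a hypothetical df5 with an extra all-FALSE row so that the filter has something to drop:

```r
library(dplyr)

# Hypothetical data: row 4 has no TRUE value in the c_ columns
df5 <- tibble(
  id  = c(1, 2, 3, 4),
  c_1 = c(TRUE, TRUE, FALSE, FALSE),
  c_2 = c(FALSE, FALSE, TRUE, FALSE)
)

kept <- df5 %>%
  filter(if_any(starts_with("c_")))  # keep rows where any c_ column is TRUE

kept$id
```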
A base R option using grepl:
library(dplyr) # For tibble
df <- tibble(id = c(1, 2, 3), c_1 = c(T, T, F), c_2 = c(F, F, T)) %>% group_by(id)
# Pick the c_ columns by name, then test each row for at least one TRUE
cols <- grepl("^c_", names(df))
df$valid <- apply(as.data.frame(df)[cols], 1, any)
df
#> # A tibble: 3 × 4
#> # Groups: id [3]
#> id c_1 c_2 valid
#> <dbl> <lgl> <lgl> <lgl>
#> 1 1 TRUE FALSE TRUE
#> 2 2 TRUE FALSE TRUE
#> 3 3 FALSE TRUE TRUE
Created on 2022-07-11 by the reprex package (v2.0.1)
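A fully vectorized base R variant avoids apply() by folding the selected columns with the element-wise | operator. This is a sketch on a plain data.frame, with a hypothetical fourth row added to show a FALSE result:

```r
# Plain base R data frame (no dplyr needed); row 4 is a hypothetical all-FALSE row
df4 <- data.frame(
  id  = c(1, 2, 3, 4),
  c_1 = c(TRUE, TRUE, FALSE, FALSE),
  c_2 = c(FALSE, FALSE, TRUE, FALSE)
)

c_cols <- grepl("^c_", names(df4))     # logical index of the c_ columns
df4$valid <- Reduce(`|`, df4[c_cols])  # element-wise OR across those columns

df4
```

Reduce() treats the selected columns as a list and combines them pairwise, so it works for any number of c_ columns.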