
R data frame: select rows that meet logical conditions over multiple columns (variables) indexed by name

OK, this example should clarify what I am looking for:

set.seed(123456789)

df <- data.frame(
  x1 = sample(c(0,1), size = 10, replace = TRUE),
  x2 = sample(c(0,1), size = 10, replace = TRUE),
  z1 = sample(c(0,1), size = 10, replace = TRUE)
  )

I want to select all rows in which both x1 and x2 equal 1. That is,

df[df$x1==1 & df$x2==1,]

which returns

   x1 x2 z1
1   1  1  1
4   1  1  1
6   1  1  1
10  1  1  0

but I want to do this in a way that scales to many x variables (e.g. x1, x2, ..., x40), so I would like to pick the columns by their "x" prefix rather than having to write df$x1==1 & df$x2==1 &... & df$x40==1. Note that I want to keep the z1 variable in the resulting data set (i.e. the rows are selected based on the x variables, but I am not looking to select only the x columns). Is it possible?

A possible solution, based on dplyr:

library(dplyr)

set.seed(123456789)

df <- data.frame(
  x1 = sample(c(0,1), size = 10, replace = TRUE),
  x2 = sample(c(0,1), size = 10, replace = TRUE),
  z1 = sample(c(0,1), size = 10, replace = TRUE)
)

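# if_all() keeps a row only when the condition holds in every selected column;
# it replaces across() inside filter(), which newer dplyr releases deprecate
df %>% 
  filter(if_all(starts_with("x"), ~ .x == 1))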

#>   x1 x2 z1
#> 1  1  1  1
#> 2  1  1  1
#> 3  1  1  1
#> 4  1  1  0

Here is a base R way, with Reduce applied to each row of the data.frame.

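# Positions of the columns whose names start with "x"
cols <- grep("^x", names(df))

# For each row, AND together the comparisons (the \(x) lambda needs R >= 4.1)
i <- apply(df[cols], 1, \(x) Reduce(`&`, x == 1L))
df[i,]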
#   x1 x2 z1
#1   1  1  1
#4   1  1  1
#6   1  1  1
#10  1  1  0
