
Look for a string in a data frame and select the rows with R

I have a table like this:

Column A  Column B  Column C  Column D
A         x         1         k1
B         k         2         k2
C         z         3         k3
D         y         4         k4

I would like to write a script that looks for a specific string and returns the rows containing it. For example, I want to see the rows that contain "A", which can be in any column. I tried str_detect, but it requires specifying the column to search, which I don't want to do. Ideally it would also handle several strings at once, for example looking for "A", "3" and "y" with this output:

Column A  Column B  Column C  Column D
A         x         1         k1
C         z         3         k3
D         y         4         k4
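
# df is the data frame from the question (the data section calls it df1)
vec <- c('A', 3, 'y')  # c() coerces 3 to the string "3"
# as.matrix(df) %in% vec flags matching cells; rowsum() counts the flags per row
df[rowsum(+(as.matrix(df) %in% vec), c(row(df)))>0,]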

  ColumnA ColumnB ColumnC ColumnD
1       A       x       1      k1
3       C       z       3      k3
4       D       y       4      k4
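
The one-liner above packs several steps together. A minimal sketch of the same idea, split into named steps, might look like this. It assumes the data frame matches the table in the question, and it swaps rowsum() for rowSums() on a reshaped logical matrix, which is a substitution made here for readability and not the original author's code.

```r
# Example data, assumed to match the table in the question
df <- data.frame(
  ColumnA = c("A", "B", "C", "D"),
  ColumnB = c("x", "k", "z", "y"),
  ColumnC = 1:4,
  ColumnD = c("k1", "k2", "k3", "k4")
)
vec <- c("A", 3, "y")  # c() coerces 3 to the string "3"

m    <- as.matrix(df)                       # character matrix, 4 x 4
hit  <- matrix(m %in% vec, nrow = nrow(m))  # TRUE where a cell equals a target
keep <- rowSums(hit) > 0                    # TRUE FALSE TRUE TRUE
result <- df[keep, ]
result
```

Because %in% compares whole cells, "k3" does not count as a match for "3" here.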

Another way is to use a regex:

df[grepl(sprintf('\\b(%s)\\b', paste0(vec, collapse = '|')), do.call(paste, df)), ]
  ColumnA ColumnB ColumnC ColumnD
1       A       x       1      k1
3       C       z       3      k3
4       D       y       4      k4
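
To see why this works, it can help to look at the two intermediate values: the pattern built by sprintf() and the one-string-per-row vector built by do.call(paste, df). The sketch below assumes the same example data as the question.

```r
# Example data, assumed to match the table in the question
df <- data.frame(
  ColumnA = c("A", "B", "C", "D"),
  ColumnB = c("x", "k", "z", "y"),
  ColumnC = 1:4,
  ColumnD = c("k1", "k2", "k3", "k4")
)
vec <- c("A", 3, "y")

# Alternation of the targets, wrapped in word boundaries
pattern <- sprintf("\\b(%s)\\b", paste0(vec, collapse = "|"))

# Each row collapsed into one space-separated string, e.g. "A x 1 k1"
row_text <- do.call(paste, df)

# The word boundaries stop the "3" in "k3" from matching on its own
matches <- grepl(pattern, row_text)  # TRUE FALSE TRUE TRUE
```

Note that the targets are interpolated into the pattern unescaped, so a target containing regex metacharacters would need escaping first.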

We could use if_any with str_detect:

library(dplyr)
library(stringr)
df1 %>% 
   filter(if_any(everything(), ~ str_detect(.x, 'A|3|y')))

-output

  ColumnA ColumnB ColumnC ColumnD
1       A       x       1      k1
2       C       z       3      k3
3       D       y       4      k4
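
One thing to keep in mind: str_detect with 'A|3|y' matches substrings, so a cell such as "k3" also counts as a hit for "3". That makes no difference for the example data, but it can on other inputs. A small sketch with a hypothetical data frame df2 (not from the question) contrasts the substring match with a whole-cell match using %in%:

```r
library(dplyr)
library(stringr)

# Hypothetical data: row 2 contains "k3" but no cell equal to "3"
df2 <- data.frame(
  ColumnA = c("A", "B"),
  ColumnC = c(1L, 2L),
  ColumnD = c("k1", "k3")
)

# Substring match: both rows are kept, because "k3" contains "3"
loose <- df2 %>%
  filter(if_any(everything(), ~ str_detect(.x, "A|3|y")))

# Whole-cell match: only row 1 is kept
strict <- df2 %>%
  filter(if_any(everything(), ~ .x %in% c("A", "3", "y")))
```

If regex matching is still wanted, anchoring the pattern as '^(A|3|y)$' should give the same whole-cell behaviour.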

Or using base R with a non-regex solution:

subset(df1, Reduce(`|`, lapply(df1, \(x) x %in% c('A', 3, 'y'))))
  ColumnA ColumnB ColumnC ColumnD
1       A       x       1      k1
3       C       z       3      k3
4       D       y       4      k4
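
If this lookup is needed in more than one place, the same Reduce/lapply idea could be wrapped in a small function. The helper name rows_with_any and its arguments are made up for this sketch and do not come from any package.

```r
# Hypothetical helper; the name and arguments are illustrative only
rows_with_any <- function(data, targets) {
  targets <- as.character(targets)
  # One logical vector per column: TRUE where the cell equals a target
  hits <- lapply(data, function(col) as.character(col) %in% targets)
  # OR the columns together so a hit in any column keeps the row
  data[Reduce(`|`, hits), , drop = FALSE]
}

# Same example data as in the data section
df1 <- data.frame(
  ColumnA = c("A", "B", "C", "D"),
  ColumnB = c("x", "k", "z", "y"),
  ColumnC = 1:4,
  ColumnD = c("k1", "k2", "k3", "k4")
)

res <- rows_with_any(df1, c("A", 3, "y"))
res
```

Converting each column with as.character() keeps the comparison consistent for numeric and factor columns alike.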

data

df1 <- structure(list(ColumnA = c("A", "B", "C", "D"), ColumnB = c("x", 
"k", "z", "y"), ColumnC = 1:4, ColumnD = c("k1", "k2", "k3", 
"k4")), class = "data.frame", row.names = c(NA, -4L))
