Finding elements of a list in R

Question

Right now I'm working with a character vector in R, which I split word by word using strsplit. I'm wondering if there's a function I can use to check the whole list, see whether a specific word is in it, and (if possible) say which elements of the list contain it.

Example:

a = c("a","b","c")
b = c("b","d","e")
c = c("a","e","f")

If z = list(a,b,c), then f("a",z) would ideally yield [1] 1 3, and f("b",z) would ideally yield [1] 1 2.

Any assistance would be wonderful.

Answer 1

As alexwhan says, grep is the function to use. However, be careful about using it with a list. It isn't doing what you might think it's doing. For example:

grep("c", z)
[1] 1 2 3   # ?

grep(",", z)
[1] 1 2 3   # ???

What's happening behind the scenes is that grep coerces its 2nd argument to character, using as.character. When applied to a list, what as.character returns is the character representation of each list element, as obtained by deparsing it. (Modulo an unlist.)

as.character(z)
[1] "c(\"a\", \"b\", \"c\")" "c(\"b\", \"d\", \"e\")" "c(\"a\", \"e\", \"f\")"

cat(as.character(z))
c("a", "b", "c") c("b", "d", "e") c("a", "e", "f")

This is what grep is working on. It is also why "c" and "," match every element above: both characters occur in the deparsed text c("a", "b", "c"), whether or not they occur in the data.

If you want to run grep on a list, a safer method is to use lapply. This returns another list, which you can operate on to extract what you're interested in.
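To make the coercion concrete, here is a minimal sketch that reuses the z from the question. The variable names (deparsed, hits_list, hits_deparsed, hits_paren) are only illustrative. It checks that calling grep on the list gives the same result as calling it on the deparsed strings, and that a character absent from the data can still match.

```r
# The list from the question
z <- list(c("a", "b", "c"), c("b", "d", "e"), c("a", "e", "f"))

# What grep actually searches: one deparsed string per list element
deparsed <- as.character(z)

hits_list     <- grep("c", z)         # grep applied to the list
hits_deparsed <- grep("c", deparsed)  # grep applied to the deparsed strings

# Both calls search the same text, so the results agree
print(identical(hits_list, hits_deparsed))

# "(" is not in any of the vectors, yet it matches every element,
# because it appears in the deparsed text c("a", "b", "c")
hits_paren <- grep("(", z, fixed = TRUE)
print(hits_paren)
```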

res <- lapply(z, function(ch) grep("a", ch))
res
[[1]]
[1] 1

[[2]]
integer(0)

[[3]]
[1] 1


# which vectors contain a search term
sapply(res, function(x) length(x) > 0)
[1]  TRUE FALSE  TRUE
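The question asked for the positions of the matching list elements rather than a logical vector. One way to get there from res, sketched here with the same z, is to wrap the length check in which(). The helper name containing is hypothetical, and lengths() assumes R 3.2.0 or later.

```r
z <- list(c("a", "b", "c"), c("b", "d", "e"), c("a", "e", "f"))

# Hypothetical helper: indices of the list elements in which grep finds the pattern
containing <- function(pattern, lst) {
  res <- lapply(lst, function(ch) grep(pattern, ch))
  which(lengths(res) > 0)  # keep elements with at least one match
}

idx_a <- containing("a", z)
idx_b <- containing("b", z)
print(idx_a)
print(idx_b)
```

Note that grep still treats the pattern as a regular expression and matches substrings inside each word, so this suits pattern searches more than exact word lookups.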

Answer 2

Much faster than grep is the following, where x is the list and "A" is the search term:

sapply(x, function(y) "A" %in% y)

and if you want the index, of course, just use which():

which(sapply(x, function(y) "A" %in% y))
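As a rough sketch, the %in% approach can be wrapped into the f(word, z) the question asks for. The function body is an assumption about what the asker wants (exact matches on whole words), and vapply is used in place of sapply only so that the result type is fixed.

```r
# Sketch of the f() from the question: which list elements contain `word` exactly?
f <- function(word, lst) {
  which(vapply(lst, function(v) word %in% v, logical(1)))
}

a <- c("a", "b", "c")
b <- c("b", "d", "e")
c <- c("a", "e", "f")
z <- list(a, b, c)

print(f("a", z))
print(f("b", z))
print(f("q", z))  # no element contains "q"
```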

Evidence!

x = setNames(replicate(26, sample(LETTERS, 10, replace = TRUE), simplify = FALSE), LETTERS)

head(x)

$A
 [1] "A" "M" "B" "X" "B" "J" "P" "L" "M" "L"

$B
 [1] "H" "G" "F" "R" "B" "E" "D" "I" "L" "R"

$C
 [1] "P" "R" "C" "N" "K" "E" "R" "S" "N" "P"

$D
 [1] "F" "B" "B" "Z" "E" "Y" "J" "R" "H" "P"

$E
 [1] "O" "P" "E" "X" "S" "Q" "S" "A" "H" "B"

$F
 [1] "Y" "P" "T" "T" "P" "N" "K" "P" "G" "P"

system.time(replicate(1000, grep("A", x)))

   user  system elapsed 
   0.11    0.00    0.11 

system.time(replicate(1000, sapply(x, function(y) "A" %in% y)))

   user  system elapsed 
   0.05    0.00    0.05
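Speed aside, the two timed calls do not answer the same question: %in% tests for an exact, whole-element match, while grep looks for the pattern anywhere inside the text. A small sketch with made-up data (the list words is hypothetical) shows the difference.

```r
# Hypothetical data: "a" appears as a whole word only in the second vector
words <- list(c("apple", "b"), c("a", "c"), c("d", "e"))

# Exact match: is "a" one of the elements?
exact <- which(vapply(words, function(v) "a" %in% v, logical(1)))

# Substring match: does any element contain the letter "a"?
partial <- which(vapply(words, function(v) any(grepl("a", v, fixed = TRUE)), logical(1)))

print(exact)
print(partial)
```

For words produced by strsplit, the exact match is usually the one that is wanted.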

Answer 3

You're looking for grep():

grep("a", z)
#[1] 1 3

grep("b", z)
#[1] 1 2

3 answers

Answer 1 (19, accepted, 2013-06-28 06:38:04)
Answer 2 (7, 2018-10-09 10:05:56)
Answer 3 (6, 2013-06-28 06:12:36)
