Searching a data.frame in R

Question

I have a data.frame similar to the following simplified one:

ddf
  id                  country          area
1  1 United States of America North America
2  2           United Kingdom        Europe
3  3     United Arab Emirates          Arab
4  4             Saudi Arabia          Arab
5  5                   Brazil South America

ddf = structure(list(id = 1:5, country = c("United States of America", 
"United Kingdom", "United Arab Emirates", "Saudi Arabia", "Brazil"
), area = c("North America", "Europe", "Arab", "Arab", "South America"
)), .Names = c("id", "country", "area"), class = "data.frame", row.names = c(NA, 
-5L))

I want to print all rows where the text 'america' (case-insensitive) appears in:

any column
the column named 'area'

The number of rows and columns, and the column names, are variable, so I cannot use ddf[,1] etc.

I tried the following, but it does not work:

ddf[apply(ddf, 1, function(x) grepl('america',x, ignore.case=T) ),]
   id              country   area
2   2       United Kingdom Europe
3   3 United Arab Emirates   Arab
NA NA                 <NA>   <NA>

Answer 1

Here's an approach using the qdap package:

library(qdap)
Search(ddf, "america")

##   id                  country          area
## 1  1 United States of America North America
## 5  5                   Brazil South America

Have a look at the source code for more info on how it works.

For the second request (searching only the 'area' column):

Search(ddf, "america", "area")

Answer 2

In base R:

ddf[do.call(mapply,c(any,lapply(ddf,grepl,pattern="america",ignore.case=TRUE))),]

#  id                  country          area
#1  1 United States of America North America
#5  5                   Brazil South America
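The one-liner is dense, so here is a sketch of what it does step by step. The intermediate names (`hits`, `keep`, `keep2`) are mine, not the answerer's, and `ddf` is rebuilt from the question so the block runs on its own. The `Reduce` line is an alternative way to combine the columns, not part of the original answer.

```r
# Rebuild the example data from the question.
ddf <- data.frame(
  id = 1:5,
  country = c("United States of America", "United Kingdom",
              "United Arab Emirates", "Saudi Arabia", "Brazil"),
  area = c("North America", "Europe", "Arab", "Arab", "South America"),
  stringsAsFactors = FALSE
)

# Step 1: one logical vector per column, TRUE where that cell matches.
hits <- lapply(ddf, grepl, pattern = "america", ignore.case = TRUE)

# Step 2: mapply(any, col1, col2, ...) walks the columns in parallel,
# so each row becomes TRUE if any of its cells matched.
keep <- do.call(mapply, c(any, hits))

# Alternative (my assumption of an equivalent form): element-wise OR
# across the list of per-column vectors.
keep2 <- Reduce(`|`, hits)

ddf[keep, ]   # rows 1 and 5
```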

Answer 3

 hasAm <- sapply(ddf, grepl, pattern = "america", ignore.case = TRUE)
 ddf[ rowSums(hasAm) > 0 , ]
  id                  country          area
1  1 United States of America North America
5  5                   Brazil South America

The first value, hasAm, is a logical 'image' of the data frame: one TRUE or FALSE per cell. The second line uses logical indexing to return every row that contains at least one TRUE.
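The same per-cell idea can cover the second request (only the 'area' column) by restricting which columns are scanned. This is a minimal sketch: the helper name `search_rows` and its `cols` argument are hypothetical, made up for illustration. It combines columns with `Reduce` instead of `rowSums`, because `sapply` simplifies to a plain vector when the data frame has a single row, and `rowSums` needs a matrix.

```r
# Hypothetical helper: rows of `df` where `pattern` matches in any of `cols`.
search_rows <- function(df, pattern, cols = names(df)) {
  # One logical vector per selected column.
  hits <- lapply(df[cols], grepl, pattern = pattern, ignore.case = TRUE)
  # Element-wise OR across columns, starting from all FALSE.
  keep <- Reduce(`|`, hits, rep(FALSE, nrow(df)))
  df[keep, , drop = FALSE]
}

# Example data rebuilt from the question.
ddf <- data.frame(
  id = 1:5,
  country = c("United States of America", "United Kingdom",
              "United Arab Emirates", "Saudi Arabia", "Brazil"),
  area = c("North America", "Europe", "Arab", "Arab", "South America"),
  stringsAsFactors = FALSE
)

any_col   <- search_rows(ddf, "america")             # first request: any column
area_only <- search_rows(ddf, "america", "area")     # second request: 'area' only
country   <- search_rows(ddf, "america", "country")  # only row 1 matches here
```

Passing a character vector of column names keeps the helper independent of column positions, which matches the constraint in the question that names and counts vary.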

Answer 4

It was suggested I un-delete this answer, so here it is.

Another way, using mapply:

> m <- mapply(grep, "america", ddf, ignore.case = TRUE)
> ddf[unique(unlist(m)), ]
#   id                  country          area
# 1  1 United States of America North America
# 5  5                   Brazil South America

You can also use lapply and sapply in the same manner:

> s <- sapply(ddf, grep, pattern = "america", ignore.case = TRUE)
> ddf[unique(unlist(s)), ]
#   id                  country          area
# 1  1 United States of America North America
# 5  5                   Brazil South America
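One caveat with the index-based approach: `unique(unlist(...))` lists row numbers in the order the columns were scanned, not in row order, and `sapply` may simplify its result to a matrix when every column yields the same number of matches. The variant below is my own adjustment, not part of the answer. It uses `lapply` (which always returns a list) and `sort` to restore the original row order. The pattern "brazil|north" is chosen only to make the ordering visible.

```r
# Example data rebuilt from the question.
ddf <- data.frame(
  id = 1:5,
  country = c("United States of America", "United Kingdom",
              "United Arab Emirates", "Saudi Arabia", "Brazil"),
  area = c("North America", "Europe", "Arab", "Arab", "South America"),
  stringsAsFactors = FALSE
)

# Matching row numbers, one integer vector per column.
m <- lapply(ddf, grep, pattern = "brazil|north", ignore.case = TRUE)

# 'country' matches row 5 and is scanned before 'area', which matches row 1.
unsorted <- unique(unlist(m))   # 5, then 1
rows <- sort(unsorted)          # 1, then 5

ddf[rows, ]
```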

4 answers

Answer 1
1 Accepted 2014-09-12 02:02:47

Answer 2
1 2014-09-12 02:08:17

Answer 3
1 2014-09-12 02:09:17

Answer 4
1 2014-09-12 02:11:20
