Searching a data.frame in R

I have a data.frame similar to the following simplified one:

ddf
  id                  country          area
1  1 United States of America North America
2  2           United Kingdom        Europe
3  3     United Arab Emirates          Arab
4  4             Saudi Arabia          Arab
5  5                   Brazil South America

ddf = structure(list(id = 1:5, country = c("United States of America", 
"United Kingdom", "United Arab Emirates", "Saudi Arabia", "Brazil"
), area = c("North America", "Europe", "Arab", "Arab", "South America"
)), .Names = c("id", "country", "area"), class = "data.frame", row.names = c(NA, 
-5L))

I want to print all rows that contain the text "america" (case-insensitive) in:

  1. any column
  2. the column named "area"

The number of rows and columns, as well as the column names, can vary, so I cannot use ddf[,1] and the like.

I tried the following, but it does not work:

ddf[apply(ddf, 1, function(x) grepl('america',x, ignore.case=T) ),]
   id              country   area
2   2       United Kingdom Europe
3   3 United Arab Emirates   Arab
NA NA                 <NA>   <NA>

Here is an approach using the qdap package:

library(qdap)
Search(ddf, "america")

##   id                  country          area
## 1  1 United States of America North America
## 5  5                   Brazil South America

See the source code for more information on how it works.

For the second request...

Search(ddf, "america", "area")

In base R:

ddf[do.call(mapply,c(any,lapply(ddf,grepl,pattern="america",ignore.case=TRUE))),]

#  id                  country          area
#1  1 United States of America North America
#5  5                   Brazil South America

hasAm <- sapply(ddf, grepl, pattern = "america", ignore.case = TRUE)
ddf[rowSums(hasAm) > 0, ]
  id                  country          area
1  1 United States of America North America
5  5                   Brazil South America

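The first value, hasAm, is simply a logical "image" of the data frame; the second line uses logical indexing to keep every row that contains at least one TRUE.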
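To make the logical "image" concrete, and to cover the second request (matching only in the column named "area"), here is a minimal sketch. It rebuilds the example data from the question; the variable `cols` is a hypothetical name introduced only for this illustration. It also assumes the data frame has at least two rows, because sapply() simplifies to a plain vector for a single-row data frame.

```r
# Rebuild the example data from the question.
ddf <- data.frame(
  id = 1:5,
  country = c("United States of America", "United Kingdom",
              "United Arab Emirates", "Saudi Arabia", "Brazil"),
  area = c("North America", "Europe", "Arab", "Arab", "South America"),
  stringsAsFactors = FALSE
)

# One logical column per data.frame column: TRUE where the pattern matches.
hasAm <- sapply(ddf, grepl, pattern = "america", ignore.case = TRUE)
print(hasAm)

# Request 1: rows with a match in any column.
any_col <- ddf[rowSums(hasAm) > 0, ]

# Request 2: rows with a match in the named column(s) only.
# `cols` is a hypothetical variable; drop = FALSE keeps hasAm a matrix
# even when a single column is selected, so rowSums() still works.
cols <- "area"
area_only <- ddf[rowSums(hasAm[, cols, drop = FALSE]) > 0, ]

print(any_col)
print(area_only)
```

Because the columns are selected by name, this should keep working when the number or order of columns changes, as long as a column called "area" exists.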

It was suggested that I undelete this answer, so here it is.

Another option uses mapply:

> m <- mapply(grep, "america", ddf, ignore.case = TRUE)
> ddf[unique(unlist(m)), ]
#   id                  country          area
# 1  1 United States of America North America
# 5  5                   Brazil South America

You can also use lapply or sapply in the same way:

> s <- sapply(ddf, grep, pattern = "america", ignore.case = TRUE)
> ddf[unique(unlist(s)), ]
#   id                  country          area
# 1  1 United States of America North America
# 5  5                   Brazil South America
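As a side note on why the apply() attempt in the question returns the wrong rows: grepl() yields one value per column, so apply() over rows produces a matrix rather than one TRUE/FALSE per row, and that matrix is then flattened when used as a row index. The sketch below is one possible fix, not the only one; it collapses each row's results with any(). The names `per_cell` and `keep` are introduced here for illustration.

```r
# Rebuild the example data from the question.
ddf <- data.frame(
  id = 1:5,
  country = c("United States of America", "United Kingdom",
              "United Arab Emirates", "Saudi Arabia", "Brazil"),
  area = c("North America", "Europe", "Arab", "Arab", "South America"),
  stringsAsFactors = FALSE
)

# The original attempt: three results per row, so the outcome is a
# 3 x 5 logical matrix (one matrix column per row of ddf).
per_cell <- apply(ddf, 1, function(x) grepl("america", x, ignore.case = TRUE))
print(dim(per_cell))

# Collapsing each row's results with any() gives one logical per row,
# which is what row indexing expects.
keep <- apply(ddf, 1, function(x) any(grepl("america", x, ignore.case = TRUE)))
print(ddf[keep, ])
```

Note that apply() converts the data frame to a character matrix first, which is fine for text searching but can be slow on large data; the sapply() and mapply() versions above avoid that conversion.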

