R - use indices and content of list to merge two dataframes [R]

Perhaps bloody obvious, but I'm new to R. My two dataframes to be merged:

longtext <- c("bla bla burp bla blub", "blah bladd", "blablaz burp")
txt <- data.frame(longtext)
queries <- c("burp", "blah")
query <- data.frame(queries)

I performed a search for the strings in query within the longer text strings in txt. The matches were saved in a list like this:

matches <-list(c(1,3), c(2))
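
The question does not show how the search itself was done. As a minimal sketch, assuming a plain substring search is what was meant, a list of this shape could be built with grep(); the use of fixed = TRUE (literal matching rather than regular expressions) is an assumption, not something stated in the question:

```r
# Sample data copied from the question
longtext <- c("bla bla burp bla blub", "blah bladd", "blablaz burp")
queries  <- c("burp", "blah")

# For each query, collect the indices of the texts that contain it.
# fixed = TRUE treats the query as a literal string (an assumption).
matches <- lapply(queries, function(q) grep(q, longtext, fixed = TRUE))

str(matches)
```

Element i of the result holds the row numbers in txt that contain query i, which is the structure described here.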

The first index of the list matches, e.g. [[1]], refers to the first row in query. The content of that element, (1,3), refers to the search hits in rows 1 and 3 of txt. So I want to merge both dataframes by using the indices and content of matches to get:

queries; longtext
"burp"; "bla bla burp bla blub"
"burp"; "blablaz burp"
"blah"; "blah bladd"

But... my loop over indices and content doesn't work. Is there an easier way with apply()? I will be feeding it lots of data...

matches_long <- data.frame()  
for (i in 1:length(matches)) {
  for (l in 1:length(matches[[i]])) {
    matches_long[[l]] <- data.frame(query[[i]], txt[[matches[[i]][l]]])}}  

Seems to me like you could just repeat the rows of query according to the number of matches for each one, and then assign the matched values:

res <- query[rep(seq_along(matches), sapply(matches, length)),, drop = FALSE] 
res["longtext"] <- txt$longtext[unlist(matches)]
res
#     queries              longtext
# 1      burp bla bla burp bla blub
# 1.1    burp          blablaz burp
# 2      blah            blah bladd
  • in R v 3.2+ you could replace sapply(matches, length) with lengths(matches)
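
Putting the pieces together, the approach above with lengths() might look like the following self-contained sketch. The stringsAsFactors = FALSE arguments and the rownames reset are additions for illustration (they keep the columns as character and avoid row names such as 1.1); they are not part of the original answer:

```r
# Sample data from the question; stringsAsFactors = FALSE is an addition
longtext <- c("bla bla burp bla blub", "blah bladd", "blablaz burp")
queries  <- c("burp", "blah")
txt     <- data.frame(longtext, stringsAsFactors = FALSE)
query   <- data.frame(queries, stringsAsFactors = FALSE)
matches <- list(c(1, 3), c(2))

# Repeat each query row once per hit, then attach the matched texts
res <- query[rep(seq_along(matches), lengths(matches)), , drop = FALSE]
res["longtext"] <- txt$longtext[unlist(matches)]
rownames(res) <- NULL  # optional: replace 1, 1.1, 2 with 1, 2, 3

res
```

Because rep() and unlist() are vectorised, this avoids growing a data frame inside a loop, which should matter once matches gets large.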

@David Arenburg's answer is better, but as I was about to paste this in:

names(matches) <- queries
stack(lapply(matches, function(x){longtext[x]}))
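
Note that stack() names its output columns values and ind, so the result does not yet have the layout asked for in the question. A small sketch of the renaming and reordering, run on the question's sample data (the variable name res is just for illustration):

```r
# Sample data from the question
longtext <- c("bla bla burp bla blub", "blah bladd", "blablaz burp")
queries  <- c("burp", "blah")
matches  <- list(c(1, 3), c(2))

names(matches) <- queries
res <- stack(lapply(matches, function(x) longtext[x]))

# stack() returns columns "values" and "ind"; rename and reorder them
names(res) <- c("longtext", "queries")
res <- res[, c("queries", "longtext")]

res
```

The queries column comes back as a factor here; wrap it in as.character() if plain strings are needed.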
