R - use indices and content of list to merge two dataframes
Perhaps bloody obvious, but I'm new to R. These are the two data frames I want to merge:
longtext <- c("bla bla burp bla blub", "blah bladd", "blablaz burp")
txt <- data.frame(longtext)
queries <- c("burp", "blah")
query <- data.frame(queries)
I searched for the strings in query within the longer text strings in txt. The matches were saved in a list like this:
matches <- list(c(1, 3), c(2))
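For context, a list of this shape could be produced along the following lines. The original search code is not shown, so the use of grep() with fixed = TRUE here is an assumption, offered only as a sketch:

```r
# Data as given in the question.
longtext <- c("bla bla burp bla blub", "blah bladd", "blablaz burp")
queries <- c("burp", "blah")

# For each query, collect the positions of the texts that contain it.
# fixed = TRUE treats the query as a literal string rather than a regex.
matches <- lapply(queries, function(q) grep(q, longtext, fixed = TRUE))
str(matches)
```

grep() returns integer positions, so each list element holds row numbers of txt for one query.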
The position of each element in matches, e.g. [[1]], refers to the corresponding row in query. The content of that element, e.g. c(1, 3) for the first one, gives the rows in txt where that query was found, here rows 1 and 3. So I want to merge both data frames using the indices and content of matches to get:
queries; longtext
"burp"; "bla bla burp bla blub"
"burp"; "blablaz burp"
"blah"; "blah bladd"
But... my loop over indices and content doesn't work. Is there an easier way with apply()? I will be feeding this a lot of data... Here is my attempt:
matches_long <- data.frame()
for (i in 1:length(matches)) {
  for (l in 1:length(matches[[i]])) {
    matches_long[[l]] <- data.frame(query[[i]], txt[[matches[[i]][l]]])
  }
}
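One likely reason the loop above fails is that `[[` with a single index on a data frame selects a column rather than a row, and that l restarts at 1 for every i, so earlier results would be overwritten. A loop that works could look like the sketch below; the names pieces and hits are made up for illustration, and this is only one possible repair, not necessarily the most efficient one:

```r
# Data as given in the question.
longtext <- c("bla bla burp bla blub", "blah bladd", "blablaz burp")
txt <- data.frame(longtext)
queries <- c("burp", "blah")
query <- data.frame(queries)
matches <- list(c(1, 3), c(2))

# Build one small data frame per query, then bind them together once.
pieces <- vector("list", length(matches))
for (i in seq_along(matches)) {
  hits <- matches[[i]]
  pieces[[i]] <- data.frame(
    queries  = rep(query$queries[i], length(hits)),  # row i of query, repeated per hit
    longtext = txt$longtext[hits]                    # matched rows of txt
  )
}
matches_long <- do.call(rbind, pieces)
matches_long
```

Collecting the pieces in a list and calling rbind once avoids growing a data frame inside the loop, which tends to get slow with a lot of data.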
Seems to me like you could just repeat the rows of your data set according to the size of each element of matches, and then assign the matched values:
res <- query[rep(seq_along(matches), sapply(matches, length)),, drop = FALSE]
res["longtext"] <- txt$longtext[unlist(matches)]
res
# queries longtext
# 1 burp bla bla burp bla blub
# 1.1 burp blablaz burp
# 2 blah blah bladd
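To make the two indexing steps in this answer easier to follow, here is a small sketch that prints the intermediate index vectors on their own. The names row_idx and txt_idx are introduced only for illustration:

```r
matches <- list(c(1, 3), c(2))

# Which row of query each output row comes from:
# element 1 has two hits, element 2 has one.
row_idx <- rep(seq_along(matches), sapply(matches, length))
row_idx
# [1] 1 1 2

# Which row of txt each output row comes from.
txt_idx <- unlist(matches)
txt_idx
# [1] 1 3 2
```

Both vectors have the same length and the same order, which is why the matched texts line up with the repeated query rows.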
You can replace sapply(matches, length) with lengths(matches).
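As a sketch of that suggestion, the same result can be built directly from the vectors with lengths(), which is available from R 3.2.0 onward. Building a fresh data frame instead of subsetting query is a variation introduced here, not part of the answer above; it has the side effect of giving plain row names 1, 2, 3:

```r
# Data as given in the question.
longtext <- c("bla bla burp bla blub", "blah bladd", "blablaz burp")
queries <- c("burp", "blah")
matches <- list(c(1, 3), c(2))

res <- data.frame(
  queries  = rep(queries, lengths(matches)),  # each query repeated once per hit
  longtext = longtext[unlist(matches)]        # the matched texts, in the same order
)
res
```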
@David Arenburg's answer is better, but as I was about to paste this in:
names(matches) <- queries
stack(lapply(matches, function(x){longtext[x]}))
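stack() returns a data frame with columns named values and ind, so one more step is needed to get the layout asked for in the question. A possible follow-up is sketched below; the names stacked and res are made up for illustration:

```r
# Data as given in the question.
longtext <- c("bla bla burp bla blub", "blah bladd", "blablaz burp")
queries <- c("burp", "blah")
matches <- list(c(1, 3), c(2))

names(matches) <- queries
stacked <- stack(lapply(matches, function(x) longtext[x]))

# ind is a factor holding the list names; values holds the matched texts.
res <- data.frame(
  queries  = as.character(stacked$ind),
  longtext = stacked$values
)
res
```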