Combine a data.frame and a list of data.frames with no common variables in R

Question

I have a data frame (D) and a list of data frames (L) that I want to combine into a new data frame. There is one row in D for every data frame in L, and I want to join these data together so that each row in D is matched with the corresponding data frame in L and replicated across each of its rows. The data frames in L have varying numbers of rows, but they all have the same columns and could easily be combined into a single data frame (e.g., using plyr::rbind.fill). There are no common variables between D and the data frames in L; the only way I know which rows go together is by the order in which they appear in D and L.

Here is toy data with the same structure as my data:

# the data frame
D <- data.frame(name = c("john","sally","ben"), age = c(23, 31, 27))

# the list of data frames
john <- data.frame(attempt = 1:3, result = c("fail","fail","fail"))
sally <- data.frame(attempt = 1, result = c("success"))
ben <- data.frame(attempt = 1:5, result = c("fail","fail","success","fail","success"))
L <- list(john, sally, ben)

The dumb way I have tried to do this is with a for loop:

# loop to combine data frame and list
new_D <- data.frame()
for (i in 1:nrow(D)) {
    add <- cbind(D[i,], L[[i]])
    new_D <- rbind(new_D, add)
}

It works, but it is very slow and my files are quite large, so it is not practical. What is a cleaner and more efficient way to do this in R?

Answer 1

Name the list elements, convert the list to a single data.table with an index column ("name"), then join with the original data on the "name" column:

names(L) <- D$name
D2 <- data.table::rbindlist(L, use.names = TRUE, idcol = "name")  
D2[D, on = "name"]
#     name attempt  result age
# 1:  john       1    fail  23
# 2:  john       2    fail  23
# 3:  john       3    fail  23
# 4: sally       1 success  31
# 5:   ben       1    fail  27
# 6:   ben       2    fail  27
# 7:   ben       3 success  27
# 8:   ben       4    fail  27
# 9:   ben       5 success  27
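
The join above matches on "name", so it relies on the names in D being unique. As a hedged sketch of the same idea keyed on position instead (the order of D and L, which is all the question guarantees), an unnamed list can be stacked so that the id column holds integer positions. The names row_id, stacked and D_keyed are illustrative and not from the original answer.

```r
library(data.table)

# Toy data from the question
D <- data.frame(name = c("john", "sally", "ben"), age = c(23, 31, 27))
L <- list(
  data.frame(attempt = 1:3, result = c("fail", "fail", "fail")),
  data.frame(attempt = 1,   result = "success"),
  data.frame(attempt = 1:5, result = c("fail", "fail", "success", "fail", "success"))
)

# For an unnamed list, rbindlist() fills the id column with the
# integer position of each element (1, 2, 3, ...)
stacked <- rbindlist(L, use.names = TRUE, idcol = "row_id")

# Give D the same positional key (row_id is a hypothetical helper column)
D_keyed <- as.data.table(D)
D_keyed[, row_id := .I]

# Join on position: one output row per row of stacked
new_D <- D_keyed[stacked, on = "row_id"]

# Drop the helper column once the join is done
new_D[, row_id := NULL]
```

Because the key is the list position rather than a label, this variant should still line up correctly when two rows of D share the same name.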

Answer 2

We can split D by row sequence and then cbind each piece with the matching element of L using Map:

do.call(rbind, Map(cbind, split(D, seq_len(nrow(D))), L))

Or set the names of L to the pasted rows of D, bind the rows, and separate the id column back into two columns:

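library(tidyverse)
# paste each row of D into one "name,age" string and use it as the list name
do.call(paste, c(D, sep = ",")) %>%
    set_names(L, .) %>%
    bind_rows(.id = 'grp') %>%
    # split the id back into columns; convert = TRUE restores age to a numeric type
    separate(grp, into = c('name', 'age'), sep = ",", convert = TRUE)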
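
The split/Map idea at the start of this answer can also be written without splitting D at all. The following is a minimal base R sketch, not part of the original answer, that repeats each row of D once per row of the matching element of L and binds the result to the stacked list. It assumes, as the question states, that D and L are in the same order; n_rows and idx are illustrative names.

```r
# Toy data from the question
D <- data.frame(name = c("john", "sally", "ben"), age = c(23, 31, 27))
L <- list(
  data.frame(attempt = 1:3, result = c("fail", "fail", "fail")),
  data.frame(attempt = 1,   result = "success"),
  data.frame(attempt = 1:5, result = c("fail", "fail", "success", "fail", "success"))
)

# Assumption from the question: one row of D per element of L, same order
stopifnot(nrow(D) == length(L))

# Number of rows in each element of L, in list order
n_rows <- vapply(L, nrow, integer(1))

# Repeat each row index of D once per row of the matching element of L
idx <- rep(seq_len(nrow(D)), times = n_rows)

# Stack L once, then bind the repeated rows of D alongside it
new_D <- cbind(D[idx, , drop = FALSE], do.call(rbind, L))
rownames(new_D) <- NULL
```

This calls rbind once rather than growing a data frame inside a loop, which is usually where the loop in the question loses time.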

Answer 1
4 | accepted | 2018-05-26 15:55:06

Answer 2
2 | 2018-05-26 15:41:55
