Merging two lists of data frames by ID and DATE in R

Question

I need to merge two lists of data frames by two key variables, ID and DATE. Here is an example of the data that I have:

 names1 <- c("df1", "df2")
 mydf1 <- data.frame(ID=c(115477, 115477), DATE=c("2012-01-31","2012-02-   29"), SCORE =c(677,635)) 
 mydf2 <- data.frame(ID=c(22319, 22319), DATE=c("2011-09-30","2011-10-31"), SCORE = c(621,630))
 list1 <- list(mydf1,mydf2)
 names(list1) <- names1

 names2 <- c("df_auto1", "df_auto2")
 mydf_auto1 <- data.frame(ID=c(22319, 22319),DATE=c("2011-09-30","2011-10-31") , Fprice =c(8708,8708)) 
 mydf_auto2 <- data.frame(ID=c(115477, 115477), DATE=c("2012-01-31","2012-02-29"), Fprice = c(NA,6543))
 list2 <- list(mydf_auto1,mydf_auto2)
 names(list2) <- names2

I tried to use the Map function, but the output I got is messed up. Here is what I tried:

 V <-Map(merge, list1, list2,MoreArgs=list(by=c('ID','DATE'), all=TRUE))

 for (i in seq_along(V)) {
 write.csv(V[[i]], paste0("merge_",i, ".csv"))
 }

As the final output, I'd like to get one data frame with ID = 115477 and fully populated variables such as DATE, SCORE and Fprice, and another data frame with ID = 22319, fully populated as well. For example, for ID = 115477 I'd like to get:

  ID        DATE          SCORE    Fprice
 115477    2012-01-31     677     NA
 115477    2012-02-29     635     6543

Does anyone have any idea what I am doing wrong? Thank you for your help.

Answer 1

Here is a tidyverse approach:

library(tidyverse);
list(bind_rows(list1), bind_rows(list2)) %>%
    reduce(function(x, y) full_join(x, y, by = c("ID", "DATE"))) %>%
    filter(ID %in% c(115477))
#      ID       DATE SCORE Fprice
#1 115477 2012-01-31   677     NA
#2 115477 2012-02-29   635   6543

Explanation: For each list we bind the rows into a single data.frame; we collect the two collapsed data.frames in a list and then perform a full outer join by "ID" and "DATE"; finally we use dplyr::filter to pull out the rows of interest (here ID == 115477). The output shown assumes the stray spaces in the question's "2012-02-   29" have been removed; otherwise that date would not match its counterpart in list2.
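If, as in the question, one data frame per ID is wanted rather than a single filtered result, the joined table can be split by ID afterwards. A minimal sketch, assuming the date typo is corrected and using base split() on the joined result (the names joined and by_id are illustrative, not from the original answer):

```r
library(dplyr)

# Example data from the question, with the date typo corrected
list1 <- list(
  df1 = data.frame(ID = c(115477, 115477), DATE = c("2012-01-31", "2012-02-29"), SCORE = c(677, 635)),
  df2 = data.frame(ID = c(22319, 22319), DATE = c("2011-09-30", "2011-10-31"), SCORE = c(621, 630))
)
list2 <- list(
  df_auto1 = data.frame(ID = c(22319, 22319), DATE = c("2011-09-30", "2011-10-31"), Fprice = c(8708, 8708)),
  df_auto2 = data.frame(ID = c(115477, 115477), DATE = c("2012-01-31", "2012-02-29"), Fprice = c(NA, 6543))
)

# Stack each list into one data frame, then full (outer) join on both keys
joined <- full_join(bind_rows(list1), bind_rows(list2), by = c("ID", "DATE"))

# One data frame per ID; the list names are the ID values as character strings
by_id <- split(joined, joined$ID)
by_id[["115477"]]
```

Because the rows are stacked before joining, the order of the data frames inside list1 and list2 does not matter here.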

Answer 2

Overview

Conduct the merge() inside of mapply().

The end result is a list containing two data frames, each one the result of the j-th element of list2 being outer joined onto the i-th element of list1, with elements paired by position.

Note: There was a typo in the second DATE element within mydf1 that is corrected below. My answer depends on list1 and list2 holding data frames that contain the same ID values, in the same order. As the OP has it arranged, mydf_auto1 is set to be merged onto mydf1, whereas mydf_auto1 should be merged onto mydf2 because those two data frames share the same ID value. I revise the ordering within list2 to produce the desired output.

# create first list of data frames
names1 <- c("df1", "df2")
# note the extra spacing in "2012-02-29" has been corrected
mydf1 <- data.frame(ID=c(115477, 115477), DATE=c("2012-01-31","2012-02-29"), SCORE =c(677,635)) 
mydf2 <- data.frame(ID=c(22319, 22319), DATE=c("2011-09-30","2011-10-31"), SCORE = c(621,630))
list1 <- list(mydf1,mydf2)
names(list1) <- names1

# create second list of data frames
names2 <- c("df_auto1", "df_auto2")
# here is where I relabel the data frames
# to sync with `mydf1` and `mydf2` based on 
# the `ID` values contained in `mydf_auto1` and `mydf_auto2`
mydf_auto1 <- data.frame(ID=c(115477, 115477), DATE=c("2012-01-31","2012-02-29"), Fprice = c(NA,6543))
mydf_auto2 <- data.frame(ID=c(22319, 22319),DATE=c("2011-09-30","2011-10-31") , Fprice =c(8708,8708)) 
list2 <- list(mydf_auto1,mydf_auto2)
names(list2) <- names2

# merge the list of data frames together
merged.list.of.dfs <-
  mapply( FUN = function( i, j )
    merge( x = i
           , y = j
           , by = c( "ID", "DATE" )
           , all = TRUE )
    , list1
    , list2
    , SIMPLIFY = FALSE )

# view results
merged.list.of.dfs
# $df1
#       ID       DATE SCORE Fprice
# 1 115477 2012-01-31   677     NA
# 2 115477 2012-02-29   635   6543
# 
# $df2
#      ID       DATE SCORE Fprice
# 1 22319 2011-09-30   621   8708
# 2 22319 2011-10-31   630   8708

# end of script #
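
Rather than relabelling the data frames by hand, list2 can also be reordered so that each element lines up with the element of list1 that has the same ID. A minimal sketch, assuming every data frame holds exactly one distinct ID and every ID in list1 also occurs in list2 (the helper first_id and the name list2_aligned are hypothetical, not part of the original answer):

```r
# The question's original ordering: list2 is in the opposite ID order to list1
list1 <- list(
  df1 = data.frame(ID = c(115477, 115477), DATE = c("2012-01-31", "2012-02-29"), SCORE = c(677, 635)),
  df2 = data.frame(ID = c(22319, 22319), DATE = c("2011-09-30", "2011-10-31"), SCORE = c(621, 630))
)
list2 <- list(
  df_auto1 = data.frame(ID = c(22319, 22319), DATE = c("2011-09-30", "2011-10-31"), Fprice = c(8708, 8708)),
  df_auto2 = data.frame(ID = c(115477, 115477), DATE = c("2012-01-31", "2012-02-29"), Fprice = c(NA, 6543))
)

# Hypothetical helper: the single ID stored in a data frame
first_id <- function(df) df$ID[1]

ids1 <- vapply(list1, first_id, numeric(1))
ids2 <- vapply(list2, first_id, numeric(1))

# Reorder list2 so its i-th element has the same ID as list1's i-th element
list2_aligned <- list2[match(ids1, ids2)]

# Positional merge now pairs data frames that share an ID
merged <- Map(merge, list1, list2_aligned,
              MoreArgs = list(by = c("ID", "DATE"), all = TRUE))
merged$df1
```

If an ID in list1 has no counterpart in list2, match() returns NA for it, so that case would need separate handling.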

Answer 3

It would be easier to do a single merge, then separately extract the IDs you want:

names1 <- c("df1", "df2")
mydf1 <- data.frame(ID=c(115477, 115477), DATE=c("2012-01-31","2012-02-29"), SCORE =c(677,635)) 
mydf2 <- data.frame(ID=c(22319, 22319), DATE=c("2011-09-30","2011-10-31"), SCORE = c(621,630))
# Note the change to use of rbind instead of list
list1 <- rbind(mydf1, mydf2)

names2 <- c("df_auto1", "df_auto2")
mydf_auto1 <- data.frame(ID=c(22319, 22319),DATE=c("2011-09-30","2011-10-31") , Fprice =c(8708,8708)) 
mydf_auto2 <- data.frame(ID=c(115477, 115477), DATE=c("2012-01-31","2012-02-29"), Fprice = c(NA,6543))
list2 <- rbind(mydf_auto1,mydf_auto2)

df <- merge(list1, list2, by = c("ID", "DATE"))
df[df$ID == 115477,]
df[df$ID == 22319, ]
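
Note that merge() performs an inner join by default, so rows whose ID and DATE appear in only one table are dropped; all = TRUE keeps them, as in the question's Map() call. A small sketch with made-up rows (the data frames scores and prices are illustrative, not the question's data):

```r
# Made-up data: the second score row has no matching price row
scores <- data.frame(ID = c(1, 1), DATE = c("2020-01-31", "2020-02-29"), SCORE = c(600, 610))
prices <- data.frame(ID = 1, DATE = "2020-01-31", Fprice = 5000)

# Default: inner join, only rows present in both tables survive
inner <- merge(scores, prices, by = c("ID", "DATE"))

# all = TRUE: full outer join, unmatched rows are kept and filled with NA
outer <- merge(scores, prices, by = c("ID", "DATE"), all = TRUE)

nrow(inner)  # 1
nrow(outer)  # 2
```

With the question's corrected data every key matches, so both forms give the same rows there.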

Answer 1
1 2018-04-11 00:52:44

Answer 2
1 Accepted 2018-04-11 00:59:01

Answer 3
0 2018-04-11 00:27:25
