Merge two data frames by id and date
I have two data frames. A:
id date
1 2010-05-08
2 2012-08-08
3 2013-06-23
B:
id date1
1 2010-05-09
2 2012-08-08
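I need to merge these two data frames by id, with the additional condition that the date in table B equals the date in table A plus 1 day. I also want to flag the rows where the merge condition is TRUE.
The final output should be A: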
id date date1 flag
1 2010-05-08 2010-05-09 1
2 2012-08-08 NA NA
3 2013-06-23 NA NA
Code to generate the data:
A <- data.frame(customer = c(1,2,3),
application_date = c("2010-05-08", "2012-08-08", "2013-06-23"))
B <- data.frame(customer = c(1,2),
application_date = c("2010-05-09", "2012-08-08"))
How about this?
Data:
A <- data.frame(customer = c(1,2,3),
application_date = c("2010-05-08", "2012-08-08", "2013-06-23"))
B <- data.frame(customer = c(1,2),
application_date = c("2010-05-09", "2012-08-08"))
dplyr:
library(dplyr)
# Left join on customer only; the date condition is checked afterwards
data <- left_join(A, B, by = "customer")
data %>%
mutate(logic = if_else(as.Date(application_date.x) + 1 == as.Date(application_date.y), 1, 0)) %>%
rename("id" = "customer",
"date" = "application_date.x",
"date1" = "application_date.y",
"flag" = "logic")
Output:
id date date1 flag
1 2010-05-08 2010-05-09 1
2 2012-08-08 2012-08-08 0
3 2013-06-23 <NA> NA
data.table:
library(data.table)
# Convert both inputs so that merge() dispatches to the data.table method
data_2 <- merge(as.data.table(A), as.data.table(B), by = "customer", all.x = TRUE)
# Columns are evaluated inside data_2, so no data$ prefix is needed
data_2[, logic := ifelse(as.Date(application_date.x) + 1 == as.Date(application_date.y), 1, 0)]
setnames(data_2, old = c("customer", "application_date.x", "application_date.y", "logic"),
new = c("id", "date", "date1", "flag"))
Output:
id date date1 flag
1 2010-05-08 2010-05-09 1
2 2012-08-08 2012-08-08 0
3 2013-06-23 <NA> NA
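Both outputs above keep date1 for id 2 and set its flag to 0, while the expected output in the question shows NA in both columns for that row. One way to get the NA version is to join on the shifted date as well as on the id. The following is only a minimal base R sketch under that assumption; the helper names A2, B2, key and res are made up for illustration and do not come from the answer.

```r
# Same data as above
A <- data.frame(customer = c(1, 2, 3),
                application_date = c("2010-05-08", "2012-08-08", "2013-06-23"))
B <- data.frame(customer = c(1, 2),
                application_date = c("2010-05-09", "2012-08-08"))

# Work with real Date values so that "+ 1" means "one day later"
A2 <- data.frame(id = A$customer, date = as.Date(A$application_date))
B2 <- data.frame(id = B$customer, date1 = as.Date(B$application_date))

# Hypothetical helper key: the date a matching row of B has to carry
A2$key <- A2$date + 1
B2$key <- B2$date1
B2$flag <- 1L  # any row of B that survives the join is a match

# Left join on both id and the shifted date
res <- merge(A2, B2, by = c("id", "key"), all.x = TRUE)
res <- res[order(res$id), c("id", "date", "date1", "flag")]
rownames(res) <- NULL
res
```

With this approach unmatched rows get NA in date1 and flag instead of 0, because the date condition is part of the join rather than a check applied afterwards.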
If you don't mind updating A directly by reference, here is an update join option:
library(data.table)
setDT(A)[, d := dateA + 1L]
setDT(B)
A[B, on=.(id, d=dateB), c("dateB", "flag") := .(dateB, 1L)]
Data:
A <- data.frame(id = c(1,2,3),
dateA = as.Date(c("2010-05-08", "2012-08-08", "2013-06-23")))
B <- data.frame(id = c(1,2),
dateB = as.Date(c("2010-05-09", "2012-08-08")))
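To make the lookup behind the update join easier to follow, here is a rough base R equivalent that uses match() on a composite key. It is a sketch, not the data.table implementation: key_A, key_B and pos are illustrative names, and it assumes each (id, date) pair occurs at most once in B.

```r
A <- data.frame(id = c(1, 2, 3),
                dateA = as.Date(c("2010-05-08", "2012-08-08", "2013-06-23")))
B <- data.frame(id = c(1, 2),
                dateB = as.Date(c("2010-05-09", "2012-08-08")))

# Composite lookup keys: id plus the date a matching row of B must carry
key_A <- paste(A$id, format(A$dateA + 1))
key_B <- paste(B$id, format(B$dateB))

# For each row of A, the position of its matching row in B (NA if none)
pos <- match(key_A, key_B)

A$dateB <- B$dateB[pos]                        # NA where nothing matched
A$flag  <- ifelse(is.na(pos), NA_integer_, 1L) # 1 only for matched rows
A
```

Only id 1 finds a partner in B here, so the other two rows end up with NA in both new columns, which is the shape of the output requested in the question.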
Using data.table:
library(data.table)
# Left join on customer (A and B as defined in the question)
data <- merge(A, B, by = "customer", all.x = TRUE)
setDT(data)
# Rows without a match in B keep flag = NA
data[as.Date(application_date.x) + 1 == as.Date(application_date.y), flag := 1]
data[as.Date(application_date.x) + 1 != as.Date(application_date.y), flag := 0]
data <- data[, .(id = customer, date = application_date.x, date1 = application_date.y, flag)]
data