R: Merging data.frames with multiple conditions & multiple logical operators
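Good day,
I have run into a challenging problem and would like to find an elegant way to:
Merge two data.frames by:
a. two common variables; and
b. a date variable, i.e. if DATE >= START_DATE & DATE <= END_DATE; and
c. a combined code/ID variable, i.e. if CODE_X == CODE_ID | CODE_X == ID
Here is data.frame 1: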
CODE_ID = c("A01", "A10", "E01", "C01", "T01")
ID = c("A", "A", "E", "C", "T")
DATE = c("2008-07-01", "2008-07-01", "2009-08-01", "2008-09-01", "2009-10-01")
TF_1 = c("F", "F", "F", "F", "F")
D_VAR_1 = c("D_0101", "D_0101", "D_0101", "D_0101", "D_0102")
DF1 = data.frame(CODE_ID, ID, DATE, TF_1, D_VAR_1)
Here is data.frame 2:
CODE_X = c("A", "A10", "E", "C", "T01")
START_DATE = c("2008-07-01", "2009-07-01", "2009-07-01", "2008-07-01", "2009-07-01")
END_DATE= c("2009-06-30", "2010-06-30", "2010-06-30", "2009-06-30", "2010-06-30")
TF_2 = c("F", "F", "F", "F", "F")
D_VAR_2 = c("D_0101", "D_0102", "D_0101", "D_0101", "D_0102")
NAME = c("ACCIDENT", "MISC ACCIDENT", "ENERGY", "CONSTRUCTION", "POLITICS")
DF2 = data.frame(CODE_X, START_DATE, END_DATE, TF_2, D_VAR_2, NAME)
My final data.frame 3 should look like this:
CODE_ID = c("A01", "A10", "E01", "C01", "T01")
ID = c("A", "A", "E", "C", "T")
DATE = c("2008-07-01", "2008-07-01", "2009-08-01", "2008-09-01", "2009-10-01")
TF_1 = c("F", "F", "F", "F", "F")
D_VAR_1 = c("D_0101", "D_0101", "D_0101", "D_0101", "D_0102")
NAME = c("ACCIDENT", "MISC ACCIDENT", "ENERGY", "CONSTRUCTION", "POLITICS")
DF3 = data.frame(CODE_ID, ID, DATE, TF_1, D_VAR_1, NAME)
Try the sqldf package. It lets you combine data frames as if you were writing an SQL query, which can help with more complex joins.
library(sqldf)
sqldf.Example <- sqldf('select DF1.*, DF2.NAME from DF1 join DF2 on (DF1.CODE_ID = DF2.CODE_X or DF1.ID = DF2.CODE_X) and DF1.DATE between DF2.START_DATE and DF2.END_DATE')
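For readers who do not have sqldf installed, the same join condition can be sketched in base R. This is only an illustration, assuming the tables are small enough that a full cross join is affordable. The names `pairs`, `keep`, and `base_result` are made up for this example, and the TF and D_VAR columns are left out for brevity.

```r
# Sample data from the question (only the columns the join needs)
DF1 <- data.frame(
  CODE_ID = c("A01", "A10", "E01", "C01", "T01"),
  ID      = c("A", "A", "E", "C", "T"),
  DATE    = c("2008-07-01", "2008-07-01", "2009-08-01", "2008-09-01", "2009-10-01"),
  stringsAsFactors = FALSE
)
DF2 <- data.frame(
  CODE_X     = c("A", "A10", "E", "C", "T01"),
  START_DATE = c("2008-07-01", "2009-07-01", "2009-07-01", "2008-07-01", "2009-07-01"),
  END_DATE   = c("2009-06-30", "2010-06-30", "2010-06-30", "2009-06-30", "2010-06-30"),
  NAME       = c("ACCIDENT", "MISC ACCIDENT", "ENERGY", "CONSTRUCTION", "POLITICS"),
  stringsAsFactors = FALSE
)

# Cross join: by = NULL pairs every row of DF1 with every row of DF2
pairs <- merge(DF1, DF2, by = NULL)

# Compare the dates as Date objects rather than as strings
d <- as.Date(pairs$DATE)
s <- as.Date(pairs$START_DATE)
e <- as.Date(pairs$END_DATE)

# Same condition as the SQL: (code match OR id match) AND date within range
keep <- (pairs$CODE_ID == pairs$CODE_X | pairs$ID == pairs$CODE_X) & d >= s & d <= e
base_result <- pairs[keep, c("CODE_ID", "ID", "DATE", "NAME")]

# Restore the row order of DF1 (each CODE_ID has exactly one match in this data)
base_result <- base_result[match(DF1$CODE_ID, base_result$CODE_ID), ]
rownames(base_result) <- NULL
```

The cross join creates nrow(DF1) * nrow(DF2) intermediate rows, so for larger tables the sqldf or data.table approaches are likely the better fit.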
Another option is a non-equi update join from data.table:
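library(data.table) #data.table_v1.12.4
setDT(DF1)
setDT(DF2)
# Convert the date columns to IDate so the range conditions compare dates, not strings
DF1[, DATE := as.IDate(DATE)]
DF2[, `:=`(START_DATE = as.IDate(START_DATE), END_DATE = as.IDate(END_DATE))]
# Pass 1: match CODE_ID against CODE_X within the date range
DF1[DF2, on=.(CODE_ID=CODE_X, DATE>=START_DATE, DATE<=END_DATE), NAME := i.NAME]
# Pass 2: match ID against CODE_X and fill only the rows that still have no NAME
DF1[DF2, on=.(ID=CODE_X, DATE>=START_DATE, DATE<=END_DATE),
    NAME := fifelse(is.na(x.NAME), i.NAME, x.NAME)]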
output:
   CODE_ID ID       DATE TF_1 D_VAR_1         NAME
1:     A01  A 2008-07-01    F  D_0101     ACCIDENT
2:     A10  A 2008-07-01    F  D_0101     ACCIDENT
3:     E01  E 2009-08-01    F  D_0101       ENERGY
4:     C01  C 2008-09-01    F  D_0101 CONSTRUCTION
5:     T01  T 2009-10-01    F  D_0102     POLITICS
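The two-pass logic above (match on CODE_ID first, then fall back to ID for rows that are still unmatched) can also be sketched without data.table. This is a rough base R illustration, not the answer's method. `find_name` is a hypothetical helper written for this example, and the TF and D_VAR columns are omitted.

```r
# Sample data from the question, with the date columns stored as Date
DF1 <- data.frame(
  CODE_ID = c("A01", "A10", "E01", "C01", "T01"),
  ID      = c("A", "A", "E", "C", "T"),
  DATE    = as.Date(c("2008-07-01", "2008-07-01", "2009-08-01", "2008-09-01", "2009-10-01")),
  stringsAsFactors = FALSE
)
DF2 <- data.frame(
  CODE_X     = c("A", "A10", "E", "C", "T01"),
  START_DATE = as.Date(c("2008-07-01", "2009-07-01", "2009-07-01", "2008-07-01", "2009-07-01")),
  END_DATE   = as.Date(c("2009-06-30", "2010-06-30", "2010-06-30", "2009-06-30", "2010-06-30")),
  NAME       = c("ACCIDENT", "MISC ACCIDENT", "ENERGY", "CONSTRUCTION", "POLITICS"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each (key, date) pair, return the NAME of the first
# DF2 row whose CODE_X equals the key and whose date range contains the date,
# or NA when no such row exists.
find_name <- function(keys, dates) {
  vapply(seq_along(keys), function(i) {
    hit <- which(DF2$CODE_X == keys[i] &
                 dates[i] >= DF2$START_DATE &
                 dates[i] <= DF2$END_DATE)
    if (length(hit) == 0) NA_character_ else DF2$NAME[hit[1]]
  }, character(1))
}

pass1 <- find_name(DF1$CODE_ID, DF1$DATE)  # pass 1: match on CODE_ID
pass2 <- find_name(DF1$ID, DF1$DATE)       # pass 2: match on ID

# Keep the CODE_ID match where there is one, otherwise use the ID match
DF1$NAME <- ifelse(is.na(pass1), pass2, pass1)
```

One design difference worth noting: if a row matched different DF2 rows through CODE_ID and through ID, a single join with an OR condition would return both matches, whereas this two-pass approach keeps one NAME per DF1 row and prefers the CODE_ID match.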