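R: how to merge 2 data frames on 3 columns, all with different column names
I am trying to do an inner join to merge two data sets on the same three columns, but the columns are named differently in each data set. I checked Stack Overflow and other sources, but the questions there merge on a single column that has the same name in both data sources.
Current code: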
state <- c('AZ','MD','NY', 'CA', 'FL')
STATE_ID <- c('AZ','MD','NY', 'CA', 'FL')
month <- c(1,2,3,4,5,6,7,8,9,10,11,12)
MONTH_ID <- c(1,2,3,4,5,6,7,8,9,10,11,12)
year <- c(2001, 2002, 2003, 2004)
YEAR_ID <- c(2001, 2002, 2003, 2004)
# note all rates are fake numbers
eduRate <- c(7.5, 6.2, 1.3, 9.9, ....)
otherCol <- c('a','b','c','d','e','f','g' ....)
DROPOUT_RATE <- c(1.2, 3.2, 5.3, 1.9, ....)
someOtherCol <- c('a','b','c','d','e','f','g' ....)
anotherCol <- c('a','b','c','d','e','f','g' ....)
data1 <- data.frame(state, month, year, eduRate, otherCol)
data2 <- data.frame(STATE_ID, MONTH_ID, YEAR_ID, DROPOUT_RATE, someOtherCol, anotherCol)
mergeDf <- merge(x=data1, y=data2,
by.x=state, by.y=STATE_ID,
by.x=month, by.y=MONTH_ID,
by.x=year, by.y=YEAR_ID) # <-- NOT WORKING
merge(x=data1, y=data2, by=c("state","year","month")) # <-- cannot use because the column names differ between the data sets
Desired output (the extra columns are not needed):
#merge on state, month and year to get both edu and dropout rates
state, month, year, eduRate, DROPOUT_RATE
AZ 1 2001 7.5 1.2
AZ 2 2002 9.2 3.2
AZ 3 2003 1.3 1.2
...
AL 1 2001 2.5 1.9
AL 2 2002 5.2 1.7
AL 3 2003 4.3 3.4
...
WY 1 2001 2.5 1.9
WY 2 2002 5.2 1.7
WY 3 2003 4.3 3.4
Thanks in advance for any help.
I like the syntax that dplyr (from the tidyverse) uses for joining on multiple keys.
state <- c('AZ','MD','NY', 'CA')
STATE_ID <- c('AZ','MD','NY', 'CA')
month <- c(1,2,3,4)
MONTH_ID <- c(1,2,3,4)
year <- c(2001, 2002, 2003, 2004)
YEAR_ID <- c(2001, 2002, 2003, 2004)
# note all rates are fake numbers
eduRate <- c(7.5, 6.2, 1.3, 9.9)
otherCol <- c('a','b','c','d')
DROPOUT_RATE <- c(1.2, 3.2, 5.3, 1.9)
someOtherCol <- c('a','b','c','d')
anotherCol <- c('a','b','c','d')
data1 <- data.frame(state, month, year, eduRate, otherCol)
data2 <- data.frame(STATE_ID, MONTH_ID, YEAR_ID, DROPOUT_RATE, someOtherCol, anotherCol)
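library(dplyr)  # provides %>% and left_join()
# by maps each key in data1 to its differently named counterpart in data2;
# swap in inner_join() to keep only rows that match in both data frames
df <- data1 %>%
  left_join(data2, by = c("state" = "STATE_ID", "month" = "MONTH_ID", "year" = "YEAR_ID"))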
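For comparison, base R's merge() should be able to do the same without extra packages. The attempt in the question repeats by.x and by.y and passes objects rather than column names; by.x and by.y each take a single character vector, and the columns are paired by position. A minimal sketch, assuming the same toy data as above (the names merged and result are made up for illustration):

```r
# Toy data mirroring the example above (all rates are fake numbers)
data1 <- data.frame(
  state    = c("AZ", "MD", "NY", "CA"),
  month    = c(1, 2, 3, 4),
  year     = c(2001, 2002, 2003, 2004),
  eduRate  = c(7.5, 6.2, 1.3, 9.9),
  otherCol = c("a", "b", "c", "d")
)
data2 <- data.frame(
  STATE_ID     = c("AZ", "MD", "NY", "CA"),
  MONTH_ID     = c(1, 2, 3, 4),
  YEAR_ID      = c(2001, 2002, 2003, 2004),
  DROPOUT_RATE = c(1.2, 3.2, 5.3, 1.9),
  someOtherCol = c("a", "b", "c", "d"),
  anotherCol   = c("a", "b", "c", "d")
)

# merge() does an inner join by default (all = FALSE).
# by.x[i] is matched against by.y[i]; the result keeps the names from x.
merged <- merge(
  x = data1, y = data2,
  by.x = c("state", "month", "year"),
  by.y = c("STATE_ID", "MONTH_ID", "YEAR_ID")
)

# Keep only the columns listed in the desired output
result <- merged[, c("state", "month", "year", "eduRate", "DROPOUT_RATE")]
print(result)
```

Because merge() sorts on the key columns by default, the row order may differ from the input; pass sort = FALSE if that matters.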