Specific Join of two Dataframes
I have two data frames, df1 and df2:
> df1
     ID Gender age cd evnt  scr   test_dt
1 C0004   MALE  22  1    1   82  7/3/2014
2 C0004   MALE  22  1    2   76  7/3/2014
3 C0005   MALE  22  1    3 1514  7/3/2014
4 C0005   MALE  23  2    1   81 11/3/2014
5 C0006   MALE  23  2    2   75 11/3/2014
6 C0006   MALE  23  2    3  878 11/3/2014
and
> df2
     ID hgt  wt   phys_dt
1 C0004  70 147 6/29/2015
2 C0004  70 157 6/27/2016
3 C0005  67 175 6/27/2016
4 C0005  65 171  7/2/2014
5 C0006  69 160 6/29/2015
6 C0006  64 143  7/2/2014
I want to join df1 and df2 in a way that yields the following data frame, call it df3:
> df3
     ID Gender age cd evnt  scr hgt  wt
1 C0004   MALE  22  1    1   82  70 147
2 C0004   MALE  22  1    2   76  70 157
3 C0005   MALE  22  1    3 1514  67 175
4 C0005   MALE  23  2    1   81  65 171
5 C0006   MALE  23  2    2   75  69 160
6 C0006   MALE  23  2    3  878  64 143
I'm trying to add df2$hgt and df2$wt to the proper ID row. The tricky part is that I want to join hgt and wt to the ID row whose dates (df1$test_dt and df2$phys_dt) most closely align. I was thinking I could first sort the two data frames by ID, then by their respective dates, and then try to join? I'm not quite sure how to approach this. Thanks.
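If you want to merge by matching only df1$ID and df2$ID, the following should do it (note that every df1 row is paired with each df2 row that shares its ID, so an ID that appears more than once in df2 produces extra rows):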
df3 <- left_join(df1, df2, by = c("ID" = "ID"))
If the date must match exactly as well as the ID, you could try the following (in the sample data no test_dt is equal to any phys_dt, so hgt and wt would come back as NA):
df3 <- left_join(df1, df2, by = c("ID" = "ID", "test_dt" = "phys_dt"))
left_join() is in the dplyr package (load it with library(dplyr)).
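Neither join above picks the closest date, which is what the question asks for. A minimal sketch of one way to do that with dplyr follows. It pairs each df1 row with every df2 row for the same ID, then keeps the pairing with the smallest gap in days. It assumes both date columns are month/day/year strings; the helper columns row_id and gap are made up for this illustration. Under this rule the result need not match the df3 posted in the question row for row (rows 1 and 2 share the same ID and test date, so both get the same measurement).

```r
library(dplyr)

# Sample data copied from the question
df1 <- data.frame(
  ID      = c("C0004", "C0004", "C0005", "C0005", "C0006", "C0006"),
  Gender  = "MALE",
  age     = c(22, 22, 22, 23, 23, 23),
  cd      = c(1, 1, 1, 2, 2, 2),
  evnt    = c(1, 2, 3, 1, 2, 3),
  scr     = c(82, 76, 1514, 81, 75, 878),
  test_dt = c("7/3/2014", "7/3/2014", "7/3/2014",
              "11/3/2014", "11/3/2014", "11/3/2014"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  ID      = c("C0004", "C0004", "C0005", "C0005", "C0006", "C0006"),
  hgt     = c(70, 70, 67, 65, 69, 64),
  wt      = c(147, 157, 175, 171, 160, 143),
  phys_dt = c("6/29/2015", "6/27/2016", "6/27/2016",
              "7/2/2014", "6/29/2015", "7/2/2014"),
  stringsAsFactors = FALSE
)

# Assumption: dates are month/day/year
df2_dated <- df2 %>%
  mutate(phys_dt = as.Date(phys_dt, format = "%m/%d/%Y"))

df3 <- df1 %>%
  mutate(row_id  = row_number(),  # remember the original df1 row
         test_dt = as.Date(test_dt, format = "%m/%d/%Y")) %>%
  # Pair each df1 row with every df2 row for the same ID
  # (newer dplyr versions may warn about a many-to-many join here)
  left_join(df2_dated, by = "ID") %>%
  # Absolute gap in days between test date and physical date
  mutate(gap = abs(as.numeric(difftime(phys_dt, test_dt, units = "days")))) %>%
  group_by(row_id) %>%
  slice(which.min(gap)) %>%       # keep the closest physical per df1 row
  ungroup() %>%
  select(-row_id, -gap, -test_dt, -phys_dt)

print(df3)
```

If ties between two equally close dates matter, which.min() keeps the first one encountered; a different tie-break would need an explicit rule.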