Trying to combine/merge two data frames in R, matching values in two columns and returning a third
I'm trying to combine two data frames in R, using what I guess would be the equivalent of Excel's VLOOKUP function.
In one data frame, I have a list of events that occur in a hockey game (each game is identified by its season and "gcode"); there are hundreds of rows per game.
I want to add a column that tells me whether the team won or lost. I have the results in a different data frame (a list of the results, with one row per game).
How can I use merge() or a similar function to do this? I would need the function to match on both "season" and "gcode" in each data frame.
Here are two example data frames, and the result I want.
List of events:
     season gcode seconds score_dif
1  20072008 20001     145         2
2  20072008 20001    2055         1
3  20072008 20002     691         0
4  20082009 20053    3528        -1
5  20092010 20104    2787         1
6  20092010 20155    1752         1
7  20102011 20206    2929         0
8  20102011 20257     277         3
9  20102011 20308    2733        -2
10 20132014 20359    3890        -4
List of results:
     season gcode result
1  20072008 20001      1
2  20072008 20002      0
3  20072008 20003      1
4  20072008 20004      0
5  20072008 20005      0
6  20072008 20006      0
7  20072008 20007      0
8  20072008 20008      1
9  20072008 20009      0
10 20072008 20010      1
Combined:
     season gcode seconds score_dif result
1  20072008 20001     145         2      1
2  20072008 20001    2055         1      1
3  20072008 20002     691         0      0
4  20082009 20053    3528        -1      0
5  20092010 20104    2787         1      1
6  20092010 20155    1752         1      0
7  20102011 20206    2929         0      0
8  20102011 20257     277         3      0
9  20102011 20308    2733        -2      0
10 20132014 20359    3890        -4      1
Thanks!
Use the dplyr package:
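library(dplyr)
# Left join: keep every event row and look up its result by season and gcode
df <- events %>%
  left_join(results, by = c("season", "gcode"))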
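Since the question asks about merge() specifically, here is a minimal base R sketch of the same left join. It assumes the two data frames are named events and results, as in the answer, and uses only the first few rows of the example data from the question; the name combined is just illustrative.

```r
# First rows of the example data from the question.
events <- data.frame(
  season    = c(20072008, 20072008, 20072008, 20082009),
  gcode     = c(20001, 20001, 20002, 20053),
  seconds   = c(145, 2055, 691, 3528),
  score_dif = c(2, 1, 0, -1)
)
results <- data.frame(
  season = c(20072008, 20072008, 20072008),
  gcode  = c(20001, 20002, 20003),
  result = c(1, 0, 1)
)

# by = c("season", "gcode") matches on both columns at once.
# all.x = TRUE keeps every event row (a left join); a game that has
# no row in `results` gets NA in the result column.
combined <- merge(events, results, by = c("season", "gcode"), all.x = TRUE)
print(combined)
```

Note that merge() may reorder the rows, so sort the output afterwards if the original event order matters.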
If it does not work correctly, you can create a new column, join, in both data frames:
events$join <- paste0(events$season,events$gcode)
results$join <- ...
and then
df <- events %>%
left_join(results, by = "join")
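For reference, the key-column idea can also be sketched in base R with match(), which is probably the closest thing to VLOOKUP. This is only an illustration under a few assumptions: the key for results is built the same way as for events (the answer leaves that line elided), a "_" separator is added so two different season/gcode pairs cannot paste into the same string, and the sample data is the first few rows from the question.

```r
# First rows of the example data from the question.
events <- data.frame(
  season    = c(20072008, 20072008, 20072008, 20082009),
  gcode     = c(20001, 20001, 20002, 20053),
  seconds   = c(145, 2055, 691, 3528),
  score_dif = c(2, 1, 0, -1)
)
results <- data.frame(
  season = c(20072008, 20072008, 20072008),
  gcode  = c(20001, 20002, 20003),
  result = c(1, 0, 1)
)

# Assumption: both keys are built identically, with a separator
# between the two parts to avoid ambiguous concatenations.
events$join  <- paste(events$season, events$gcode, sep = "_")
results$join <- paste(results$season, results$gcode, sep = "_")

# match() gives, for each event, the row number in `results` that has
# the same key (NA if there is none); indexing with it looks up the result.
events$result <- results$result[match(events$join, results$join)]

# Drop the helper key column once the lookup is done.
events$join <- NULL
print(events)
```

This keeps the events rows in their original order, and events with no matching game end up with NA in result.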