Merging an irregular time series dataset with a regular time series dataset in R
Hi, I am trying to merge two datasets, one with a regular time series (dataframe A) and one with an irregular time series (dataframe B). The two dataframes look something like this:
Dataframe A:

| time | country | Action |
|---|---|---|
| 198001 | A | 1 |
| 198002 | A | 0 |
| 198003 | A | 0 |
| 198004 | A | -1 |
| ... | | |
| 201210 | Z | 1 |
| 201211 | Z | 0 |
| 201212 | Z | 0 |
Dataframe B:

| time | country | party | variable |
|---|---|---|---|
| 198002 | A | A1 | X |
| 201210 | Z | Z1 | Y |
| 201212 | Z | Z2 | Z |
I've tried using full_join from dplyr, but it yields NAs for every time period that has no matching observation in dataframe B.

What I have now looks like this:

| time | country | Action | Party | Variable |
|---|---|---|---|---|
| 198001 | A | 1 | NA | NA |
| 198002 | A | 0 | A1 | X |
| 198003 | A | 0 | A1 | NA |
| 198004 | A | -1 | A1 | NA |
| ... | | | | |
| 201210 | Z | 1 | Z1 | Y |
| 201211 | Z | 0 | Z1 | NA |
| 201212 | Z | 0 | Z2 | Z |

Instead, I would like the NAs in the non-overlapping time periods to be replaced with the most recent preceding observation from dataframe B.

So the merged dataframe would look like this:

| time | country | Action | Party | Variable |
|---|---|---|---|---|
| 198001 | A | 1 | NA | NA |
| 198002 | A | 0 | A1 | X |
| 198003 | A | 0 | A1 | X |
| 198004 | A | -1 | A1 | X |
| ... | | | | |
| 201210 | Z | 1 | Z1 | Y |
| 201211 | Z | 0 | Z1 | Y |
| 201212 | Z | 0 | Z2 | Z |
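
You can do a full_join and then fill the party and variable columns downward. Sorting by country and time and grouping by country first keeps a value from one country from being carried over into the next.

library(dplyr)  # provides %>%, full_join, arrange and group_by
result <- full_join(A, B, by = c("time", "country")) %>%
  arrange(country, time) %>%
  group_by(country) %>%
  tidyr::fill(party, variable, .direction = "down")
result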
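
To make the join-then-fill approach easier to try out, here is a minimal self-contained sketch. The toy data frames only mirror the rows shown in the question (the elided "..." rows are left out), and the arrange and group_by steps are an assumption about what is wanted, namely that values should never be carried from one country into another.

```r
library(dplyr)
library(tidyr)

# Toy data modelled on the rows shown in the question (illustrative only).
A <- data.frame(
  time    = c(198001, 198002, 198003, 198004, 201210, 201211, 201212),
  country = c("A", "A", "A", "A", "Z", "Z", "Z"),
  Action  = c(1, 0, 0, -1, 1, 0, 0),
  stringsAsFactors = FALSE
)
B <- data.frame(
  time     = c(198002, 201210, 201212),
  country  = c("A", "Z", "Z"),
  party    = c("A1", "Z1", "Z2"),
  variable = c("X", "Y", "Z"),
  stringsAsFactors = FALSE
)

result <- full_join(A, B, by = c("time", "country")) %>%
  arrange(country, time) %>%   # fill() follows row order, so sort first
  group_by(country) %>%        # assumption: no carry-over between countries
  fill(party, variable, .direction = "down") %>%
  ungroup()

print(result)
```

Rows that come before the first observation for a country (198001 for country A here) stay NA, because a downward fill has nothing earlier to copy from.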