Combine two time-series with different elements of the datetime index in Python
I have two time-series below. df1 has an index in DateTime format which includes only the date, without the time. df2 has a full datetime index, also in DateTime format. In the full data, df1 is much shorter than df2 in terms of the number of rows.
As you can see, both datasets span from the 2nd to the 6th of April. df1, however, skips some dates, while in df2 all days are available. Note: in this example only odd dates are skipped, but that is not the case in the full data.
df1
value1
date
2016-04-02 16
2016-04-04 76
2016-04-06 23
df2
value2
DateTime
2016-04-02 07:45:00 257.96
2016-04-02 07:50:00 317.58
2016-04-02 07:55:00 333.39
2016-04-03 08:15:00 449.96
2016-04-03 08:20:00 466.42
2016-04-03 08:25:00 498.56
2016-04-04 08:10:00 454.73
2016-04-04 08:15:00 472.45
2016-04-04 08:20:00 489.85
2016-04-05 07:30:00 169.54
2016-04-05 07:35:00 276.13
2016-04-05 07:40:00 293.70
2016-04-06 07:10:00 108.05
2016-04-06 07:15:00 179.21
2016-04-06 07:20:00 201.80
I want to combine the two datasets by index. df1 should control which dates are kept. The expected result is below.
value2 value1
DateTime
2016-04-02 07:45:00 257.96 16
2016-04-02 07:50:00 317.58 16
2016-04-02 07:55:00 333.39 16
2016-04-04 08:10:00 454.73 76
2016-04-04 08:15:00 472.45 76
2016-04-04 08:20:00 489.85 76
2016-04-06 07:10:00 108.05 23
2016-04-06 07:15:00 179.21 23
2016-04-06 07:20:00 201.80 23
This is my attempt.
result = pd.concat([df1, df2], axis=1, sort=True).dropna(how='all')
But the result is different from what I expect.
One option is to create a new helper column in df2 that holds the datetimes with the time part removed, using DatetimeIndex.normalize:
df2['date'] = df2.index.normalize()
Or, if the index of df1 holds plain date objects rather than timestamps, use DatetimeIndex.date:
df2['date'] = df2.index.date
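Which helper column fits depends on what the index of df1 actually holds. The following is a minimal sketch of the difference between the two attributes, using a two-row index made up for illustration (the names `idx`, `normalized` and `plain_dates` are not from the original post):

```python
import datetime
import pandas as pd

# Two timestamps taken from df2 in the question, used only for illustration.
idx = pd.DatetimeIndex(["2016-04-02 07:45:00", "2016-04-03 08:15:00"], name="DateTime")

# normalize() keeps a DatetimeIndex and resets every time to midnight.
normalized = idx.normalize()

# .date returns a NumPy object array of datetime.date values.
plain_dates = idx.date

print(type(normalized).__name__, normalized[0])
print(plain_dates.dtype, plain_dates[0])
```

If df1 has a DatetimeIndex, as the tables in the question suggest, the normalize() variant keeps both merge keys in the same datetime type.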
Then use merge with the default inner join:
result = df1.merge(df2, left_index=True, right_on='date')
print(result)
value1 value2 date
DateTime
2016-04-02 07:45:00 16 257.96 2016-04-02
2016-04-02 07:50:00 16 317.58 2016-04-02
2016-04-02 07:55:00 16 333.39 2016-04-02
2016-04-04 08:10:00 76 454.73 2016-04-04
2016-04-04 08:15:00 76 472.45 2016-04-04
2016-04-04 08:20:00 76 489.85 2016-04-04
2016-04-06 07:10:00 23 108.05 2016-04-06
2016-04-06 07:15:00 23 179.21 2016-04-06
2016-04-06 07:20:00 23 201.80 2016-04-06
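For reference, here is a self-contained sketch of the merge approach. It assumes the sample frames are rebuilt by hand from the tables in the question; the construction code and the `helper` name are not from the original post. It also drops the helper column and reorders the columns so that the output has the layout of the expected result.

```python
import pandas as pd

# Sample data rebuilt from the tables in the question.
df1 = pd.DataFrame(
    {"value1": [16, 76, 23]},
    index=pd.DatetimeIndex(["2016-04-02", "2016-04-04", "2016-04-06"], name="date"),
)
df2 = pd.DataFrame(
    {"value2": [257.96, 317.58, 333.39, 449.96, 466.42, 498.56,
                454.73, 472.45, 489.85, 169.54, 276.13, 293.70,
                108.05, 179.21, 201.80]},
    index=pd.DatetimeIndex(
        ["2016-04-02 07:45", "2016-04-02 07:50", "2016-04-02 07:55",
         "2016-04-03 08:15", "2016-04-03 08:20", "2016-04-03 08:25",
         "2016-04-04 08:10", "2016-04-04 08:15", "2016-04-04 08:20",
         "2016-04-05 07:30", "2016-04-05 07:35", "2016-04-05 07:40",
         "2016-04-06 07:10", "2016-04-06 07:15", "2016-04-06 07:20"],
        name="DateTime",
    ),
)

# Work on a copy so that df2 itself does not gain the helper column.
helper = df2.assign(date=df2.index.normalize())

# Inner join: only rows of df2 whose day appears in df1 survive.
result = (
    df1.merge(helper, left_index=True, right_on="date")
       .drop(columns="date")        # the helper column is no longer needed
       [["value2", "value1"]]       # column order of the expected result
)
print(result)
```

Using assign instead of writing to df2['date'] directly is only a design choice to leave the input frame unchanged.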
Or use merge_asof, but note that it merges each row with the previous matching value, so it gives the same result as above only if every date in df2 (with the time removed) also appears among the dates in df1:
result = pd.merge_asof(df2, df1, left_index=True, right_index=True)
print(result)
value2 value1
DateTime
2016-04-02 07:45:00 257.96 16
2016-04-02 07:50:00 317.58 16
2016-04-02 07:55:00 333.39 16
2016-04-03 08:15:00 449.96 16
2016-04-03 08:20:00 466.42 16
2016-04-03 08:25:00 498.56 16
2016-04-04 08:10:00 454.73 76
2016-04-04 08:15:00 472.45 76
2016-04-04 08:20:00 489.85 76
2016-04-05 07:30:00 169.54 76
2016-04-05 07:35:00 276.13 76
2016-04-05 07:40:00 293.70 76
2016-04-06 07:10:00 108.05 23
2016-04-06 07:15:00 179.21 23
2016-04-06 07:20:00 201.80 23
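In the output above, the rows for 2016-04-03 and 2016-04-05 were filled from the preceding date in df1, which is not what the question asks for. A possible way to check for this, and to filter such rows out afterwards, is sketched below. It is an addition rather than part of the original answer; the sample frames are rebuilt from the tables in the question, and the names `unmatched`, `asof` and `filtered` are chosen for illustration.

```python
import pandas as pd

# Sample data rebuilt from the tables in the question.
df1 = pd.DataFrame(
    {"value1": [16, 76, 23]},
    index=pd.DatetimeIndex(["2016-04-02", "2016-04-04", "2016-04-06"], name="date"),
)
df2 = pd.DataFrame(
    {"value2": [257.96, 317.58, 333.39, 449.96, 466.42, 498.56,
                454.73, 472.45, 489.85, 169.54, 276.13, 293.70,
                108.05, 179.21, 201.80]},
    index=pd.DatetimeIndex(
        ["2016-04-02 07:45", "2016-04-02 07:50", "2016-04-02 07:55",
         "2016-04-03 08:15", "2016-04-03 08:20", "2016-04-03 08:25",
         "2016-04-04 08:10", "2016-04-04 08:15", "2016-04-04 08:20",
         "2016-04-05 07:30", "2016-04-05 07:35", "2016-04-05 07:40",
         "2016-04-06 07:10", "2016-04-06 07:15", "2016-04-06 07:20"],
        name="DateTime",
    ),
)

# Days present in df2 but absent from df1; merge_asof alone is only
# equivalent to the inner merge when this is empty.
unmatched = df2.index.normalize().difference(df1.index)
print(unmatched)

# Backward as-of merge, then keep only rows whose day exists in df1.
asof = pd.merge_asof(df2, df1, left_index=True, right_index=True)
filtered = asof[asof.index.normalize().isin(df1.index)]
print(filtered)
```

With the sample data, `unmatched` holds the two skipped days and `filtered` contains the same nine rows as the inner merge.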