
How to merge two time series dataframes with different end dates and keep the longer end date

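I have two time series with the same sampling frequency but different end dates. I want to merge them into one and keep the total time range rather than the intersection. Values outside the intersection should be NaN.

I tried: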

from functools import reduce
import pandas as pd
df_to_merge = [df1, df2]
df_merged = reduce(lambda left, right: pd.merge(left, right, on='timestamp'), df_to_merge)

Data:

df1
timestamp          col1
2010-10-10 00:00    10
2010-10-10 00:01    15
...
2010-10-15 00:00    10

df2 
timestamp          col2
2010-10-07 00:00    20
2010-10-10 00:01    25
...
2010-10-18 00:00    20

Desired result:

timestamp          col1    col2
2010-10-07 00:00    NaN     20
2010-10-07 00:01    NaN     25
...
2010-10-10 00:01    10      30
2010-10-15 00:00    10      40
...
2010-10-18 00:00    NaN     20

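You can perform a join:

df_merged = df1.join(df2, how='right')

With how='right', every row of the right-hand frame (df2, the longer one) is kept, and col1 is NaN wherever df1 has no matching timestamp. join aligns on the index, so timestamp has to be the index of both frames.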
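A right join only preserves df2's range, so it covers the full time span only when df2 starts no later and ends no earlier than df1. If neither frame fully covers the other, how='outer' keeps the union of both indexes instead. The following is a minimal sketch under that assumption; the sample frames and the name df_union are made up for illustration and are not taken from the question.

```python
import pandas as pd

# Hypothetical sample frames indexed by timestamp; neither covers the other.
idx1 = pd.date_range("2010-10-10", "2010-10-15", freq="D", name="timestamp")
idx2 = pd.date_range("2010-10-12", "2010-10-18", freq="D", name="timestamp")
df1 = pd.DataFrame({"col1": range(len(idx1))}, index=idx1)
df2 = pd.DataFrame({"col2": range(len(idx2))}, index=idx2)

# how='outer' keeps the union of both indexes; rows missing from one
# frame get NaN in that frame's columns.
df_union = df1.join(df2, how="outer")
print(df_union)
```

Here df_union runs from 2010-10-10 to 2010-10-18, with NaN in col2 before df2 starts and NaN in col1 after df1 ends.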

For example:

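import pandas as pd

df1 = pd.DataFrame({'timestamp': pd.to_datetime(pd.Series(['2020-10-10 23:32',
                                                          '2020-10-13 23:28'])),
                    'col1': [5, 8]})
# Resample to daily bins and forward-fill; ffill() replaces
# fillna(method='ffill'), which newer pandas versions deprecate.
df1 = df1.set_index('timestamp').resample('1D').ffill()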

            col1
timestamp       
2020-10-10   NaN
2020-10-11   5.0
2020-10-12   5.0
2020-10-13   5.0

df2 = pd.DataFrame({'timestamp': pd.to_datetime(pd.Series(['2020-10-08 23:32',
                                                          '2020-10-15 23:28'])),
                    'col2': [50, 80]})
# Same daily resampling and forward-fill as for df1.
df2 = df2.set_index('timestamp').resample('1D').ffill()

            col2
timestamp       
2020-10-08   NaN
2020-10-09  50.0
2020-10-10  50.0
2020-10-11  50.0
2020-10-12  50.0
2020-10-13  50.0
2020-10-14  50.0
2020-10-15  50.0

df1.join(df2, how='right') then returns:

            col1  col2
timestamp             
2020-10-08   NaN   NaN
2020-10-09   NaN  50.0
2020-10-10   NaN  50.0
2020-10-11   5.0  50.0
2020-10-12   5.0  50.0
2020-10-13   5.0  50.0
2020-10-14   NaN  50.0
2020-10-15   NaN  50.0
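If timestamp is kept as a regular column, as in the question's reduce attempt, the same idea can be written with pd.merge. pd.merge defaults to how='inner', which keeps only the shared timestamps, so how='outer' has to be passed explicitly. The sketch below is one way to do this; the sample data and the helper name merge_all are hypothetical.

```python
from functools import reduce

import pandas as pd

# Hypothetical frames with timestamp as an ordinary column.
df1 = pd.DataFrame({
    "timestamp": pd.date_range("2010-10-10 00:00", periods=3, freq="min"),
    "col1": [10, 15, 10],
})
df2 = pd.DataFrame({
    "timestamp": pd.date_range("2010-10-09 23:59", periods=5, freq="min"),
    "col2": [20, 25, 20, 25, 20],
})


def merge_all(frames):
    """Outer-merge a list of frames on 'timestamp' and sort by time."""
    merged = reduce(
        lambda left, right: pd.merge(left, right, on="timestamp", how="outer"),
        frames,
    )
    return merged.sort_values("timestamp").reset_index(drop=True)


df_merged = merge_all([df1, df2])
print(df_merged)
```

Because reduce folds over the list, the same helper works for more than two frames, provided each one has a timestamp column and distinct value column names.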

