Pandas - resample rows based on another df index

I have a dataframe that looks like this:

zone            Datetime        Demand
 48  2020-08-02 00:00:00  14292.550740
 48  2020-08-02 01:00:00  14243.490740
 48  2020-08-02 02:00:00   9130.840744
 48  2020-08-02 03:00:00  10483.510740
 48  2020-08-02 04:00:00  10014.970740

I want to resample (sum) the demand values according to another df's index, which looks like this:

2020-08-02 03:00:00
2020-08-02 06:00:00
2020-08-02 07:00:00
2020-08-02 10:00:00

What is the best way to handle that?

I believe you need merge_asof:

print (df2)
                     a
2020-08-02 03:00:00  1
2020-08-02 06:00:00  2
2020-08-02 07:00:00  3
2020-08-02 10:00:00  4

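import pandas as pd

# merge_asof needs datetime keys that are sorted on both sides
df1['Datetime'] = pd.to_datetime(df1['Datetime'])
df2.index = pd.to_datetime(df2.index)

# direction='forward' matches each row to the first df2 timestamp >= Datetime
df = pd.merge_asof(df1,
                   df2.rename_axis('date2').reset_index(),
                   left_on='Datetime',
                   right_on='date2',
                   direction='forward')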
print (df)
   zone            Datetime        Demand               date2  a
0    48 2020-08-02 00:00:00  14292.550740 2020-08-02 03:00:00  1
1    48 2020-08-02 01:00:00  14243.490740 2020-08-02 03:00:00  1
2    48 2020-08-02 02:00:00   9130.840744 2020-08-02 03:00:00  1
3    48 2020-08-02 03:00:00  10483.510740 2020-08-02 03:00:00  1
4    48 2020-08-02 04:00:00  10014.970740 2020-08-02 06:00:00  2
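
To make the effect of direction='forward' concrete, here is a minimal, self-contained sketch that rebuilds the sample frames from the question. The 11:00 row is a hypothetical addition (it is not in the original data), included only to show that a row later than the last df2 timestamp finds no match and gets NaT in date2. The names sample, bounds and merged are illustrative.

```python
import pandas as pd

# Sample rows copied from the question, plus one hypothetical late row (11:00).
sample = pd.DataFrame({
    "zone": [48] * 6,
    "Datetime": pd.to_datetime([
        "2020-08-02 00:00:00",
        "2020-08-02 01:00:00",
        "2020-08-02 02:00:00",
        "2020-08-02 03:00:00",
        "2020-08-02 04:00:00",
        "2020-08-02 11:00:00",  # hypothetical: later than every boundary
    ]),
    "Demand": [14292.550740, 14243.490740, 9130.840744,
               10483.510740, 10014.970740, 1.0],
})

# The second frame: its index holds the boundaries to aggregate up to.
bounds = pd.DataFrame(
    {"a": [1, 2, 3, 4]},
    index=pd.to_datetime([
        "2020-08-02 03:00:00",
        "2020-08-02 06:00:00",
        "2020-08-02 07:00:00",
        "2020-08-02 10:00:00",
    ]),
)

# Forward search: each row is paired with the first boundary >= its Datetime.
merged = pd.merge_asof(
    sample,
    bounds.rename_axis("date2").reset_index(),
    left_on="Datetime",
    right_on="date2",
    direction="forward",
)

print(merged[["Datetime", "date2"]])
```

An exact match counts as a match (03:00 maps to the 03:00 boundary), while the hypothetical 11:00 row has nothing at or after it and is left unmatched.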

Then aggregate with sum, e.g. grouping by both columns if needed:

df = df.groupby(['zone','date2'], as_index=False)['Demand'].sum()
print (df)
   zone               date2        Demand
0    48 2020-08-02 03:00:00  48150.392964
1    48 2020-08-02 06:00:00  10014.970740
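
The grouped result above only contains boundaries that received at least one row. If every timestamp from the second frame should appear in the output, one possible approach is to reindex the sums onto that index. The sketch below is a hedged, self-contained version of the whole pipeline; it assumes a single zone and that empty bins should be reported as 0, and the names sample, bounds, merged and totals are illustrative.

```python
import pandas as pd

# Sample rows copied from the question.
sample = pd.DataFrame({
    "zone": [48] * 5,
    "Datetime": pd.to_datetime([
        "2020-08-02 00:00:00",
        "2020-08-02 01:00:00",
        "2020-08-02 02:00:00",
        "2020-08-02 03:00:00",
        "2020-08-02 04:00:00",
    ]),
    "Demand": [14292.550740, 14243.490740, 9130.840744,
               10483.510740, 10014.970740],
})

# The second frame, whose index defines the target timestamps.
bounds = pd.DataFrame(
    {"a": [1, 2, 3, 4]},
    index=pd.to_datetime([
        "2020-08-02 03:00:00",
        "2020-08-02 06:00:00",
        "2020-08-02 07:00:00",
        "2020-08-02 10:00:00",
    ]),
)

# Step 1: attach to each row the first boundary at or after its Datetime.
merged = pd.merge_asof(
    sample,
    bounds.rename_axis("date2").reset_index(),
    left_on="Datetime",
    right_on="date2",
    direction="forward",
)

# Step 2: sum per boundary, then align with every boundary in `bounds`.
# Assumption: boundaries with no rows should show a total of 0.
totals = (
    merged.groupby("date2")["Demand"]
    .sum()
    .reindex(bounds.index, fill_value=0.0)
)

print(totals)
```

With several zones, the same idea would need a per-zone step (for example grouping by both zone and date2 before aligning), which this sketch does not cover.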
