Python: summarize daily data in a DataFrame to monthly and quarterly

I have already loaded my data into a pandas DataFrame.

Example:

Date        Price
2012/12/02  141.25
2012/12/05  132.64
2012/12/06  132.11
2012/12/21  141.64                                                     
2012/12/25  143.19  
2012/12/31  139.66  
2013/01/05  145.11  
2013/01/06  145.99  
2013/01/07  145.97
2013/01/11  145.11  
2013/01/12  145.99  
2013/01/24  145.97
2013/02/23  145.11  
2013/03/24  145.99  
2013/03/28  145.97
2013/04/28  145.97
2013/05/24  145.97
2013/06/23  145.11  
2013/07/24  145.99  
2013/08/28  145.97
2013/09/28  145.97

There are just two columns: one is the date and one is the price.

How can I group or resample the data from 2013 onward into monthly and quarterly DataFrames?

Monthly:

Date        Price
2013/01/01  Monthly total
2013/02/01  Monthly total
2013/03/01  Monthly total
2013/04/01  Monthly total
2013/05/01  Monthly total
2013/06/01  Monthly total
2013/07/01  Monthly total
2013/08/01  Monthly total  
2013/09/01  Monthly total

Quarterly:

Date        Price
2013/01/01  Quarterly total
2013/04/01  Quarterly total
2013/07/01  Quarterly total

Please note that the monthly and quarterly data need to be labeled with the first day of each period, even though the original DataFrame has no rows for those days, and the number of valid daily rows in each month can vary. Also, the original DataFrame has data from 2012 to 2013, but I only need monthly and quarterly data from the beginning of 2013.

I tried something like

result1 = df.groupby([lambda x: x.year, lambda x: x.month], axis=1).sum()

but it does not work.

Thank you!

First convert your Date column into a datetime index:

df.Date = pd.to_datetime(df.Date)
df.set_index('Date', inplace=True)

Then use resample. The list of offset aliases is in the pandas documentation. To label each period by its first day, use MS for months and QS for quarters:

df.resample('QS').sum()
Out[46]: 
              Price
Date               
2012-10-01   830.49
2013-01-01  1311.21
2013-04-01   437.05
2013-07-01   437.93

df.resample('MS').sum()
Out[47]: 
             Price
Date              
2012-12-01  830.49
2013-01-01  874.14
2013-02-01  145.11
2013-03-01  291.96
2013-04-01  145.97
2013-05-01  145.97
2013-06-01  145.11
2013-07-01  145.99
2013-08-01  145.97
2013-09-01  145.97
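
The outputs above still include 2012, while the question asks for results starting in 2013. A minimal sketch of one way to handle that, assuming the data from the question and a sorted datetime index, is to slice the frame before resampling. The names df_2013, monthly and quarterly are only illustrative:

```python
import pandas as pd

# Data copied from the question.
dates = [
    "2012/12/02", "2012/12/05", "2012/12/06", "2012/12/21", "2012/12/25",
    "2012/12/31", "2013/01/05", "2013/01/06", "2013/01/07", "2013/01/11",
    "2013/01/12", "2013/01/24", "2013/02/23", "2013/03/24", "2013/03/28",
    "2013/04/28", "2013/05/24", "2013/06/23", "2013/07/24", "2013/08/28",
    "2013/09/28",
]
prices = [
    141.25, 132.64, 132.11, 141.64, 143.19,
    139.66, 145.11, 145.99, 145.97, 145.11,
    145.99, 145.97, 145.11, 145.99, 145.97,
    145.97, 145.97, 145.11, 145.99, 145.97,
    145.97,
]
df = pd.DataFrame({"Date": dates, "Price": prices})

# Convert the Date column to datetimes and use it as a sorted index.
df["Date"] = pd.to_datetime(df["Date"], format="%Y/%m/%d")
df = df.set_index("Date").sort_index()

# Keep only rows from 2013-01-01 onward (label-based slicing on the index).
df_2013 = df.loc["2013-01-01":]

# MS labels each bin with the month start, QS with the quarter start.
monthly = df_2013.resample("MS").sum()
quarterly = df_2013.resample("QS").sum()

print(monthly)
print(quarterly)
```

Slicing first means the 2012 rows never enter a bin. Note that with sum(), a month that has no rows at all would appear with a total of 0 rather than being dropped; that does not occur with the data shown here.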
