
Find maximum of column for each business quarter pandas

Assume that I have the following data set

import pandas as pd, numpy, datetime

start, end = datetime.datetime(2015, 1, 1), datetime.datetime(2015, 12, 31)
date_list = pd.date_range(start, end, freq='B')
numdays = len(date_list) 

value = numpy.random.normal(loc=1e3, scale=50, size=numdays)
ids = numpy.repeat([1], numdays)

test_df = pd.DataFrame({'Id': ids,
               'Date': date_list,
               'Value': value})

I would now like to calculate the maximum within each business quarter for test_df. One possibility is to use resample with rule='BQ', how='max'. However, I'd like to keep the structure of the DataFrame and just add another column holding the maximum for each BQ. Do you have any suggestions on how to do this?

I think the following should work for you. It groups on the quarter, calls transform on the 'Value' column, and returns the maximum value as a Series with its index aligned to the original df:

In [26]:
test_df['max'] = test_df.groupby(test_df['Date'].dt.quarter)['Value'].transform('max')
test_df
Out[26]:
          Date  Id        Value          max
0   2015-01-01   1  1005.498555  1100.197059
1   2015-01-02   1  1032.235987  1100.197059
2   2015-01-05   1   986.906171  1100.197059
3   2015-01-06   1   984.473338  1100.197059
........
256 2015-12-25   1   997.965285  1145.215837
257 2015-12-28   1   929.652812  1145.215837
258 2015-12-29   1  1086.128017  1145.215837
259 2015-12-30   1   921.663949  1145.215837
260 2015-12-31   1   938.189566  1145.215837

[261 rows x 4 columns]
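
Note that dt.quarter only yields the numbers 1 to 4, so if the data covered more than one year, Q1 of 2015 and Q1 of 2016 would fall into the same group. A minimal sketch of a year-aware variant, assuming a hypothetical two-year sample (the frame df, the date range, and the seed are made up for illustration), could group on a quarterly period instead. Because the dates here are all business days, calendar quarters and business quarters contain the same rows:

```python
import numpy as np
import pandas as pd

# Hypothetical two-year sample, so each quarter number occurs twice.
dates = pd.date_range("2015-01-01", "2016-12-31", freq="B")
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Id": 1,
    "Date": dates,
    "Value": rng.normal(loc=1e3, scale=50, size=len(dates)),
})

# Year-aware quarter labels (2015Q1, ..., 2016Q4) instead of plain 1..4.
quarter = df["Date"].dt.to_period("Q")

# transform('max') broadcasts each group's maximum back onto its rows,
# so the result has the same length and index as df.
df["max"] = df.groupby(quarter)["Value"].transform("max")
```

Grouping on [df['Date'].dt.year, df['Date'].dt.quarter] should give the same result; the period label just keeps it to a single key.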

