
Getting the monthly maximum of a daily dataframe with the corresponding index value

I have downloaded daily data from Yahoo Finance:

                    Open          High           Low         Close     Volume  \
Date                                                                            
2016-01-04  10485.809570  10485.910156  10248.580078  10283.440430  116249000   
2016-01-05  10373.269531  10384.259766  10173.519531  10310.099609   82348000   
2016-01-06  10288.679688  10288.679688  10094.179688  10214.019531   87751700   
2016-01-07  10144.169922  10145.469727   9810.469727   9979.849609  124188100   
2016-01-08  10010.469727  10122.459961   9849.339844   9849.339844   95672200   
...
2016-02-23   9503.120117   9535.120117   9405.219727   9416.769531   87240700   
2016-02-24   9396.480469   9415.330078   9125.190430   9167.799805   99216000   
2016-02-25   9277.019531   9391.309570   9199.089844   9331.480469          0   
2016-02-26   9454.519531   9576.879883   9436.330078   9513.299805   95662100   
2016-02-29   9424.929688   9498.570312   9332.419922   9495.400391   90978700   

I would like to find the maximum closing price each month and also the date of this closing price.

Grouping with dfM = df['Close'].groupby(df.index.month).max() returns the monthly maximums, but I lose the daily index position.

   grouped by month 
1      10310.099609
2       9757.879883

Is there a good way to keep the index?

I would be looking for a result like this:

            grouped by month 
2016-01-05      10310.099609
2016-02-01       9757.879883

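You can get the max value per month using TimeGrouper together with groupby (later pandas releases removed pd.TimeGrouper; pd.Grouper with a monthly freq is the equivalent there):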

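import numpy as np
import pandas as pd
# pandas.io.data was later moved out of pandas into the separate pandas-datareader package
from pandas.io.data import DataReader

aapl = DataReader('AAPL', data_source='yahoo', start='2015-6-1')
>>> aapl.groupby(pd.TimeGrouper('M')).Close.max()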
Date
2015-06-30    130.539993
2015-07-31    132.070007
2015-08-31    119.720001
2015-09-30    116.410004
2015-10-31    120.529999
2015-11-30    122.570000
2015-12-31    119.029999
2016-01-31    105.349998
2016-02-29     98.120003
2016-03-31    100.529999
Freq: M, Name: Close, dtype: float64
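
The same grouping can be tried without the Yahoo download, using a few rows copied from the question. This is a minimal sketch rather than the answerer's code: it assumes a pandas version that provides pd.Grouper, and it passes a MonthEnd offset object instead of the 'M' string. The names df, month_end and monthly_max are illustrative.

```python
import pandas as pd

# A handful of closing prices copied from the question's sample.
# Only these rows are used, so February's maximum is for this subset.
dates = pd.to_datetime(
    ["2016-01-04", "2016-01-05", "2016-01-06", "2016-02-26", "2016-02-29"]
)
df = pd.DataFrame(
    {"Close": [10283.440430, 10310.099609, 10214.019531, 9513.299805, 9495.400391]},
    index=dates,
)
df.index.name = "Date"

# Group the rows into calendar months; each group is labelled by its month end.
month_end = pd.Grouper(freq=pd.offsets.MonthEnd())
monthly_max = df.groupby(month_end)["Close"].max()
print(monthly_max)
```

As in the output above, the result is indexed by month-end dates, not by the days on which the maxima occurred.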

Using idxmax instead of max returns the date on which each monthly maximum occurred.

>>> aapl.groupby(pd.TimeGrouper('M')).Close.idxmax()
Date
2015-06-30   2015-06-01
2015-07-31   2015-07-20
2015-08-31   2015-08-10
2015-09-30   2015-09-16
2015-10-31   2015-10-29
2015-11-30   2015-11-03
2015-12-31   2015-12-04
2016-01-31   2016-01-04
2016-02-29   2016-02-17
2016-03-31   2016-03-01
Name: Close, dtype: datetime64[ns]
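
The dates returned by idxmax can also be used to select rows, which gives the shape the question asked for: the maximum close indexed by the day it occurred. This is a hedged sketch on rows copied from the question, assuming a pandas version with pd.Grouper. The names max_dates and result are illustrative.

```python
import pandas as pd

# Rows copied from the question's sample (a subset, not the full download).
dates = pd.to_datetime(
    ["2016-01-04", "2016-01-05", "2016-01-06", "2016-02-26", "2016-02-29"]
)
df = pd.DataFrame(
    {"Close": [10283.440430, 10310.099609, 10214.019531, 9513.299805, 9495.400391]},
    index=dates,
)

# For each calendar month, find the index label (date) of the highest close.
month_end = pd.Grouper(freq=pd.offsets.MonthEnd())
max_dates = df.groupby(month_end)["Close"].idxmax()

# Look those dates up in the original frame to keep the daily index.
result = df.loc[max_dates.values, "Close"]
print(result)
```

A month with no rows at all has no maximum to look up, so data with gaps may need those empty groups dropped first.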

To get the results side-by-side:

>>> aapl.groupby(pd.TimeGrouper('M')).Close.agg({'max date': 'idxmax', 'max price': np.max})
             max price   max date
Date                             
2015-06-30  130.539993 2015-06-01
2015-07-31  132.070007 2015-07-20
2015-08-31  119.720001 2015-08-10
2015-09-30  116.410004 2015-09-16
2015-10-31  120.529999 2015-10-29
2015-11-30  122.570000 2015-11-03
2015-12-31  119.029999 2015-12-04
2016-01-31  105.349998 2016-01-04
2016-02-29   98.120003 2016-02-17
2016-03-31  100.529999 2016-03-01
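
Later pandas releases no longer accept a renaming dict in agg on a single column. A sketch of the same side-by-side result using named aggregation is shown here. It assumes pandas 0.25 or newer, and the column names max_price and max_date are illustrative.

```python
import pandas as pd

# Rows copied from the question's sample (a subset, not the full download).
dates = pd.to_datetime(
    ["2016-01-04", "2016-01-05", "2016-01-06", "2016-02-26", "2016-02-29"]
)
df = pd.DataFrame(
    {"Close": [10283.440430, 10310.099609, 10214.019531, 9513.299805, 9495.400391]},
    index=dates,
)

# Named aggregation: each keyword becomes an output column.
month_end = pd.Grouper(freq=pd.offsets.MonthEnd())
summary = df.groupby(month_end)["Close"].agg(max_price="max", max_date="idxmax")
print(summary)
```

Keyword names must be valid Python identifiers, which is why underscores replace the spaces used in the column labels above.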


 