
Resample daily data to get monthly dataframe?

I have a daily dataframe which I am trying to resample to get the monthly Open, High, Low and Close.

df_daily

             Open   High    Low   Last  Close
Date
2010-01-04  55.15  57.55  54.55  57.50  57.30
2010-01-05  59.70  59.70  57.45  57.90  58.00
2010-01-06  60.30  60.30  57.10  57.55  57.50
2010-01-07  60.25  60.25  57.35  58.85  58.90
2010-01-08  59.40  59.95  56.90  57.30  57.65
2010-01-11  57.30  57.95  56.00  56.25  56.25
2010-01-12  56.25  56.80  53.80  54.25  54.10
2010-01-13  54.00  55.00  52.15  54.90  54.85
2010-01-14  55.45  55.70  54.15  54.30  54.35
2010-01-15  54.60  55.30  54.00  54.30  54.30
2010-01-18  53.90  55.20  53.85  54.35  54.40
2010-01-19  54.60  55.20  53.55  53.65  53.75
2010-01-20  54.40  54.40  53.45  53.60  53.70
2010-01-21  53.85  53.85  51.95  52.10  52.25
2010-01-22  51.80  52.85  50.30  51.85  52.00
2010-01-25  52.50  52.50  50.50  50.70  50.85
2010-01-27  51.25  51.25  47.80  47.90  48.20
2010-01-28  48.55  50.50  47.10  47.45  47.35
2010-01-29  47.45  52.15  45.60  51.80  51.70
2010-02-01  51.80  52.40  50.50  51.50  51.45
2010-02-02  53.25  54.10  51.40  51.80  51.80
2010-02-03  51.60  52.90  51.50  51.85  51.95

I have tried:

df2 = df_daily.resample('M',convention='end').asfreq()

This gives me a dataframe containing only the values on the last calendar day of each month (e.g. the 30th) for Open, High, Low and Close when that exact date is present in the data, and NaN otherwise.

df2=df_daily.resample('M').mean()

This returns what I assume is the mean of the Open, High, Low and Close values within each month.

I am looking to get the Open of the month from the first day of the month on which a price is available, High to be the highest value during that month, Low to be the lowest value of the month, and Close to be the actual close on the last available day.

I believe I can do this in pandas a different way using min and max, but I am wondering whether resampling can be used for this.

Expected df

             Open  High   Low  Close
Date
2010-01-29  55.15  60.3  45.6   51.7

Thanks

Resampling by month labels each group with the last calendar day of the month, regardless of which dates are actually present in the index.

df2 = df_daily.resample('M').agg({'Open':'first', 'High':'max', 
                                      'Low': 'min', 'Close':'last'})

Output:

            Open    High    Low    Close
Date                
2010-01-31  55.15   60.3    45.6    51.70
2010-02-28  51.80   54.1    50.5    51.95
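
For anyone who wants to run this end to end, here is a minimal, self-contained sketch of the same aggregation. The six rows are copied from the question's data (not the full table); the names `daily` and `monthly` are just illustrative. It passes a `pd.offsets.MonthEnd()` object rather than the `'M'` string, because recent pandas releases deprecate the `'M'` offset alias in favour of `'ME'` and the offset object sidesteps that difference. It also sorts the index first, on the assumption that `'first'` and `'last'` should follow date order.

```python
import pandas as pd

# A handful of rows copied from the question (not the full data set).
daily = pd.DataFrame(
    {
        "Open": [55.15, 60.30, 47.45, 51.80, 53.25, 51.60],
        "High": [57.55, 60.30, 52.15, 52.40, 54.10, 52.90],
        "Low": [54.55, 57.10, 45.60, 50.50, 51.40, 51.50],
        "Close": [57.30, 57.50, 51.70, 51.45, 51.80, 51.95],
    },
    index=pd.DatetimeIndex(
        ["2010-01-04", "2010-01-06", "2010-01-29",
         "2010-02-01", "2010-02-02", "2010-02-03"],
        name="Date",
    ),
)

# 'first' and 'last' depend on row order, so sort by date first.
daily = daily.sort_index()

# One aggregation per column; bins are labelled with the calendar month end.
monthly = daily.resample(pd.offsets.MonthEnd()).agg(
    {"Open": "first", "High": "max", "Low": "min", "Close": "last"}
)
print(monthly)
```

The `Last` column is simply left out of the dictionary, so it does not appear in the result.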

You can change the index to the last date actually present in the data for each month:

df2 = df_daily.resample('M').agg({'Open':'first', 'High':'max', 
                                      'Low': 'min', 'Close':'last'})

idx = df_daily.reset_index().groupby(df_daily.index.to_period('M'))['Date'].idxmax()
df2.index = df_daily.iloc[idx].index
print(df2)

Output:

            Open    High    Low    Close
Date                
2010-01-29  55.15   60.3    45.6    51.70
2010-02-03  51.80   54.1    50.5    51.95
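
A possible variation, sketched here under my own assumptions rather than as the only way to do it: carry the date along as an ordinary column and aggregate it with `'max'`, so the last available date comes out of the same `agg` call. The helper column name `LastDate` and the variable names are hypothetical, the sample rows are a subset of the question's data, and `pd.offsets.MonthEnd()` is used in place of the `'M'` string to avoid the alias deprecation in newer pandas.

```python
import pandas as pd

# A handful of rows copied from the question (not the full data set).
daily = pd.DataFrame(
    {
        "Open": [55.15, 60.30, 47.45, 51.80, 53.25, 51.60],
        "High": [57.55, 60.30, 52.15, 52.40, 54.10, 52.90],
        "Low": [54.55, 57.10, 45.60, 50.50, 51.40, 51.50],
        "Close": [57.30, 57.50, 51.70, 51.45, 51.80, 51.95],
    },
    index=pd.DatetimeIndex(
        ["2010-01-04", "2010-01-06", "2010-01-29",
         "2010-02-01", "2010-02-02", "2010-02-03"],
        name="Date",
    ),
)

# Copy the index into a helper column (hypothetical name) so it can be aggregated.
with_date = daily.sort_index().assign(LastDate=lambda d: d.index)

monthly = (
    with_date.resample(pd.offsets.MonthEnd())
    .agg({"LastDate": "max", "Open": "first", "High": "max",
          "Low": "min", "Close": "last"})
    .dropna(subset=["LastDate"])   # drop months that contain no rows at all
    .set_index("LastDate")         # label each month by its last available date
    .rename_axis("Date")
)
print(monthly)
```

Months with no rows would produce an all-NaN line from `resample`; the `dropna` step removes them before the helper column becomes the index.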

If you only want to group by year and month, use:

df3 = df_daily.groupby([df_daily.index.year,df_daily.index.month]).agg({'Open':'first',
                         'High':'max', 'Low': 'min', 'Close':'last'})

df3.index.names= ['Year', 'Month']
print(df3)

Output:

                Open    High    Low     Close
Year    Month               
2010      1     55.15   60.3    45.6    51.70
          2     51.80   54.1    50.5    51.95
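
If a single-level index is preferred over the (Year, Month) pair, one option is to group on monthly periods instead. This is only a sketch with illustrative names and a subset of the question's rows; it assumes the index is a `DatetimeIndex`, so that `to_period('M')` is available.

```python
import pandas as pd

# A handful of rows copied from the question (not the full data set).
daily = pd.DataFrame(
    {
        "Open": [55.15, 60.30, 47.45, 51.80, 53.25, 51.60],
        "High": [57.55, 60.30, 52.15, 52.40, 54.10, 52.90],
        "Low": [54.55, 57.10, 45.60, 50.50, 51.40, 51.50],
        "Close": [57.30, 57.50, 51.70, 51.45, 51.80, 51.95],
    },
    index=pd.DatetimeIndex(
        ["2010-01-04", "2010-01-06", "2010-01-29",
         "2010-02-01", "2010-02-02", "2010-02-03"],
        name="Date",
    ),
).sort_index()

# Group rows by calendar month; the result is indexed by monthly Periods.
by_period = daily.groupby(daily.index.to_period("M")).agg(
    {"Open": "first", "High": "max", "Low": "min", "Close": "last"}
)
print(by_period)
```

Unlike `resample`, a plain `groupby` only produces groups for months that actually occur in the data, so no empty months appear in the result.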
