Reindexing MultiIndex pivot table in Pandas
I would like to reindex my pivot table so that it has a continuous daily index. The first level (date) is monthly from the beginning for some series and daily for others. Currently the index looks like this:
MultiIndex([('1919-01-31', 'PX_LAST', 'M', '2099-12-31'),
('1919-02-28', 'PX_LAST', 'M', '2099-12-31'),
('1919-03-31', 'PX_LAST', 'M', '2099-12-31'),
('1919-04-30', 'PX_LAST', 'M', '2099-12-31'),
('1919-05-31', 'PX_LAST', 'M', '2099-12-31'),
('1919-06-30', 'PX_LAST', 'M', '2099-12-31'),
('1919-07-31', 'PX_LAST', 'M', '2099-12-31'),
('1919-08-31', 'PX_LAST', 'M', '2099-12-31'),
('1919-09-30', 'PX_LAST', 'M', '2099-12-31'),
('1919-10-31', 'PX_LAST', 'M', '2099-12-31'),
...
('2020-06-02', 'PX_LAST', 'D', '2099-12-31'),
('2020-06-03', 'PX_LAST', 'D', '2099-12-31'),
('2020-06-04', 'PX_LAST', 'D', '2099-12-31'),
('2020-06-05', 'PX_LAST', 'D', '2099-12-31'),
('2020-06-06', 'PX_LAST', 'D', '2099-12-31'),
('2020-06-07', 'PX_LAST', 'D', '2099-12-31'),
('2020-06-08', 'PX_LAST', 'D', '2099-12-31'),
('2020-06-08', 'PX_LAST', 'W', '2099-12-31'),
('2020-06-09', 'PX_LAST', 'D', '2099-12-31'),
('2020-06-30', 'PX_LAST', 'M', '2099-12-31')],
names=['date', 'type', 'frequency', 'expiration_date'], length=42368)
I get the start and end dates for my daily index like this (piv_table is my pivot table):
start_date = piv_table.index.min()
end_date = piv_table.index.max()
With those, I create a daily range of datetime objects, like this:
new_dates = pd.date_range(start_date[0], end_date[0], freq='D')
Next, I reindex the data:
new_pivot = piv_table.reindex(new_dates,level=0).ffill()
But nothing happens: new_pivot is identical to the original table. The index has not changed to include the daily dates. What am I doing wrong?
Here is my sample data:
date type frequency expiration_date ADP LEVL Index ADS BCI Index
1/31/1919 PX_LAST M 12/31/2099 2 3
2/28/1919 PX_LAST M 12/31/2099
3/31/1919 PX_LAST M 12/31/2099
4/30/1919 PX_LAST M 12/31/2099
5/31/1919 PX_LAST M 12/31/2099
6/30/1919 PX_LAST M 12/31/2099
7/31/1919 PX_LAST M 12/31/2099
8/31/1919 PX_LAST M 12/31/2099
9/30/1919 PX_LAST M 12/31/2099
10/31/1919 PX_LAST M 12/31/2099
11/30/1919 PX_LAST M 12/31/2099
12/31/1919 PX_LAST M 12/31/2099
1/31/1920 PX_LAST M 12/31/2099
2/29/1920 PX_LAST M 12/31/2099
3/31/1920 PX_LAST M 12/31/2099
4/30/1920 PX_LAST M 12/31/2099
5/31/1920 PX_LAST M 12/31/2099
6/30/1920 PX_LAST M 12/31/2099
6/1/2020 PX_LAST D 12/31/2099 23 2342
6/1/2020 PX_LAST W 12/31/2099
6/2/2020 PX_LAST D 12/31/2099
6/3/2020 PX_LAST D 12/31/2099
6/4/2020 PX_LAST D 12/31/2099
6/5/2020 PX_LAST D 12/31/2099
6/6/2020 PX_LAST D 12/31/2099
6/7/2020 PX_LAST D 12/31/2099
6/8/2020 PX_LAST D 12/31/2099
6/8/2020 PX_LAST W 12/31/2099
6/9/2020 PX_LAST D 12/31/2099
6/30/2020 PX_LAST M 12/31/2099
Here's a way to do it:
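import pandas as pd

# Move the MultiIndex levels into ordinary columns so "date" can be used on its own
flat = df.reset_index()
flat["date"] = pd.to_datetime(flat["date"])  # no-op if the dates are already datetimes
all_dates = pd.date_range(flat["date"].min(), flat["date"].max(), freq="D", name="date")
# Join the data onto the full daily range, then forward-fill the gaps
# (.ffill() replaces fillna(method="ffill"), which is deprecated in recent pandas)
pd.DataFrame(index=all_dates).join(flat.set_index("date")).sort_index().ffill()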
The result is (I don't have values for the Index, ADS, and BCI columns):
type frequency expiration_date ADP LEVL Index ADS BCI \
date
1919-01-31 PX_LAST M 12/31/2099 2.0 3.0 NaN NaN NaN
1919-02-01 PX_LAST M 12/31/2099 2.0 3.0 NaN NaN NaN
1919-02-02 PX_LAST M 12/31/2099 2.0 3.0 NaN NaN NaN
1919-02-03 PX_LAST M 12/31/2099 2.0 3.0 NaN NaN NaN
1919-02-04 PX_LAST M 12/31/2099 2.0 3.0 NaN NaN NaN
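Note that the join above flattens everything onto a single date index, so a date that carries both a 'D' and a 'W' row would, as far as I can tell, show up twice, and the forward fill runs across series. If each series should get its own daily chain, one possible variant is to reindex every (type, frequency, expiration_date) group separately and stitch the pieces back together. This is only a sketch under assumptions: the miniature piv_table below and its ADP values are made up for illustration, and it assumes dates are unique within each series.

```python
import pandas as pd

# Hypothetical miniature of the pivot table: one monthly and one daily series
idx = pd.MultiIndex.from_tuples(
    [
        (pd.Timestamp("2020-01-31"), "PX_LAST", "M", "2099-12-31"),
        (pd.Timestamp("2020-02-29"), "PX_LAST", "M", "2099-12-31"),
        (pd.Timestamp("2020-02-27"), "PX_LAST", "D", "2099-12-31"),
        (pd.Timestamp("2020-02-28"), "PX_LAST", "D", "2099-12-31"),
    ],
    names=["date", "type", "frequency", "expiration_date"],
)
piv_table = pd.DataFrame({"ADP": [2.0, 5.0, 23.0, 24.0]}, index=idx).sort_index()

other_levels = ["type", "frequency", "expiration_date"]

# Full daily range spanning the earliest to the latest date in the table
dates = piv_table.index.get_level_values("date")
all_dates = pd.date_range(dates.min(), dates.max(), freq="D", name="date")

# Reindex each series on its own: drop the non-date levels, expand to daily
# dates, forward-fill, and remember which series the piece belongs to
pieces = {}
for key, group in piv_table.groupby(level=other_levels):
    pieces[key] = group.droplevel(other_levels).reindex(all_dates).ffill()

# Reassemble a MultiIndex and put "date" back as the first level
new_pivot = (
    pd.concat(pieces, names=other_levels)
    .reorder_levels(["date"] + other_levels)
    .sort_index()
)
print(new_pivot.head())
```

Rows before a series' first observation stay NaN, because a forward fill has nothing to carry forward there.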