
Pandas: Number of days elapsed since last occurrence

I saw this question answered for a single date, but not for a set of different dates. I would like to create a column that counts the number of days elapsed since the last occurrence of an event in Pandas. I have a dictionary containing similar dataframes with the following structure:

                Vol  Vol_lag  Fed_meeting
Date                                     
2005-06-02  72.9000  72.5000          0.0
2005-06-10  78.3000  72.9000          0.0
2005-06-16  76.0500  78.3000          0.0
2005-06-17  73.0500  76.0500          0.0
2005-06-24  75.7000  73.0500          0.0
...             ...      ...          ...
2022-01-03  80.3288  77.8832          0.0
2022-01-04  83.1597  80.3288          0.0
2022-01-05  80.5131  83.1597          0.0

This is obtained by iterating through the data frames in my dictionary, like so:

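import pandas as pd

# `file` and `fed_file` are the paths to the Excel workbooks.
# sheet_name=None returns a dict of DataFrames, one per sheet.
df = pd.read_excel(file, sheet_name=None, index_col="Date", parse_dates=True)
fed_df = pd.read_excel(fed_file, index_col="Date", parse_dates=True)

for key in df:
    # Previous observation's volume
    df[key]["Vol_lag"] = df[key]["Vol"].shift(1)
    # Bring in the Fed_meeting flag by date
    df[key] = pd.merge(df[key], fed_df, how='outer', left_index=True, right_index=True)
    df[key].fillna(0, inplace=True)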

"Fed_meeting" is a column that contains a 1 if there's a Fed meeting on the day, 0 if not. “Fed_meeting”是一列,如果当天有美联储会议,则包含 1,如果没有,则包含 0。 I'd like to add a column "Days_elapsed" in each of the dataframes that counts the number of days elapsed since Fed_meeting was last equal to 1 (ie equal to 0 if today is Fed day, 1 if the meeting was yesterday, and so on).我想在每个数据框中添加一列“Days_elapsed”,用于计算自 Fed_meeting 上次等于 1 以来经过的天数(即,如果今天是美联储日,则等于 0,如果会议是昨天,则等于 1,等等上)。 My data is imported such that the index of the dataframe already has a datetime format.我的数据已导入,因此 dataframe 的索引已经具有日期时间格式。

Edited to add: Unfortunately the interval between dates is not regular (sometimes there is a one-week gap between data points, sometimes the data is daily), so the code has to be based on the actual days elapsed and not just the number of data points between two meetings.

Edit 2: the Fed_meeting column is already the product of a merge between my original dfs and a fed_df containing only Fed meeting dates.

Thanks a lot!

You can use an asof merge to bring in the closest past date on which there was a Fed meeting, and then calculate the difference in days between the two dates. An asof merge guarantees the result is the same length as the left DataFrame. Rows that precede the first meeting have no match and get NaT.
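To make the backward asof behaviour concrete, here is a minimal sketch on toy data. The names `obs`, `meetings` and `out`, and the values in them, are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical toy data: irregularly spaced observation dates
obs = pd.DataFrame(
    {"Vol": [1.0, 2.0, 3.0, 4.0]},
    index=pd.to_datetime(["2005-06-02", "2005-06-16", "2005-06-17", "2005-06-24"]),
)

# Hypothetical meeting dates, kept both as the index (merge key) and as a column
meeting_dates = pd.to_datetime(["2005-06-16", "2005-06-24"])
meetings = pd.DataFrame({"Prev_date": meeting_dates}, index=meeting_dates)

# For each row of `obs`, pick the latest meeting date that is <= the row's date.
# Both inputs must be sorted by the key.
out = pd.merge_asof(obs, meetings, left_index=True, right_index=True,
                    direction="backward")
print(out)
```

The result has one row per row of `obs`; the first row has no earlier meeting, so its `Prev_date` is NaT.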

Starting Data

# So there are some Fed_meetings in the actual data
print(df)

                Vol  Vol_lag  Fed_meeting
Date                                     
2005-06-02  72.9000  72.5000          0.0
2005-06-10  78.3000  72.9000          0.0
2005-06-16  76.0500  78.3000          1.0
2005-06-17  73.0500  76.0500          0.0
2005-06-24  75.7000  73.0500          1.0
2022-01-03  80.3288  77.8832          0.0
2022-01-04  83.1597  80.3288          0.0
2022-01-05  80.5131  83.1597          1.0

Code

import pandas as pd

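# Keep only the meeting rows and store their dates in a regular column
meetings = df[df['Fed_meeting'].eq(1)].copy()
meetings['Prev_date'] = meetings.index

# For every row, attach the most recent meeting date on or before that row's date
df = pd.merge_asof(df, meetings['Prev_date'],
                   left_index=True, right_index=True,
                   direction='backward')

# Calendar days elapsed since that meeting (a timedelta column)
df['Date_diff'] = df.index - df['Prev_date']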

print(df)
                Vol  Vol_lag  Fed_meeting  Prev_date Date_diff
Date                                                          
2005-06-02  72.9000  72.5000          0.0        NaT       NaT
2005-06-10  78.3000  72.9000          0.0        NaT       NaT
2005-06-16  76.0500  78.3000          1.0 2005-06-16    0 days
2005-06-17  73.0500  76.0500          0.0 2005-06-16    1 days
2005-06-24  75.7000  73.0500          1.0 2005-06-24    0 days
2022-01-03  80.3288  77.8832          0.0 2005-06-24 6037 days
2022-01-04  83.1597  80.3288          0.0 2005-06-24 6038 days
2022-01-05  80.5131  83.1597          1.0 2022-01-05    0 days
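
The question asks for a numeric "Days_elapsed" column in every dataframe of a dictionary, whereas the code above produces a timedelta column for a single frame. One possible way to bridge that gap is sketched below. The helper name `add_days_elapsed` and the `sample` frame are hypothetical, and the sketch assumes each frame has a DatetimeIndex and a Fed_meeting flag column as described in the question.

```python
import pandas as pd

def add_days_elapsed(frame, flag_col="Fed_meeting"):
    """Return a copy of `frame` with a numeric 'Days_elapsed' column (hypothetical helper)."""
    frame = frame.sort_index()  # merge_asof needs sorted keys
    meeting_dates = frame.index[frame[flag_col].eq(1)]
    meetings = pd.DataFrame({"Prev_date": meeting_dates}, index=meeting_dates)
    out = pd.merge_asof(frame, meetings, left_index=True, right_index=True,
                        direction="backward")
    # .dt.days converts the timedelta to whole days; rows before the
    # first meeting have no match and stay NaN
    out["Days_elapsed"] = (out.index - out["Prev_date"]).dt.days
    return out.drop(columns="Prev_date")

# Same dates and flags as the starting data above (other columns omitted)
sample = pd.DataFrame(
    {"Fed_meeting": [0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0]},
    index=pd.DatetimeIndex(
        ["2005-06-02", "2005-06-10", "2005-06-16", "2005-06-17",
         "2005-06-24", "2022-01-03", "2022-01-04", "2022-01-05"],
        name="Date"),
)
result = add_days_elapsed(sample)
print(result)
```

For the dictionary of frames in the question, the helper could be applied per key, for example `df = {key: add_days_elapsed(frame) for key, frame in df.items()}`. Because of the NaN rows the column is float; whether to fill those rows or leave them missing depends on how the column is used later.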
