KeyError resulting from apply and lambda function on DataFrame column
I generated a new column showing the 7th business day of each year and month from this df:
YearOfSRC MonthNumberOfSRC
0 2022 3
1 2022 4
2 2022 5
3 2022 6
4 2021 4
... ... ...
20528 2022 1
20529 2022 2
20530 2022 3
20531 2022 4
20532 2022 5
With this code:
df['PredictionDate'] = (pd
    .to_datetime(df[['YearOfSRC', 'MonthNumberOfSRC']]
                 .set_axis(['year', 'month'], axis=1)
                 .assign(day=1)
                 )
    .sub(pd.offsets.BusinessDay(1))
    .add(pd.offsets.BusinessDay(7))
)
To output this dataframe (df_final) with the new column, PredictionDate:
YearOfSRC MonthNumberOfSRC PredictionDate
0 2022 3 2022-03-09
1 2022 4 2022-04-11
2 2022 5 2022-05-10
3 2022 6 2022-06-09
4 2021 4 2021-04-09
... ... ... ...
20528 2022 1 2022-01-11
20529 2022 2 2022-02-09
20530 2022 3 2022-03-09
20531 2022 4 2022-04-11
20532 2022 5 2022-05-10
(More details here)
However, I would like to use CustomBusinessDay and Python's holidays package to modify the rows of PredictionDate where a holiday in the first week would push the 7th business day back by 1 business day. I know that CustomBusinessDay has a holidays parameter that accepts a list, so in a modular solution I would pass the list from the holidays library to that parameter. I know I could hard-code the extra business day by increasing the day by 1 for every month that has a holiday in the first week, but I would prefer a more dynamic solution. I tried this instead of the code above, but I get a KeyError:
df_final['PredictionDate'] = (pd
    .to_datetime(df_final[['YearOfSRC', 'MonthNumberOfSRC']]
                 .set_axis(['year', 'month'], axis=1)
                 .assign(day=1)
                 )
    .sub(df_final.apply(lambda x: pd.offsets.CustomBusinessDay(1, holidays=holidays.US(years=x['YearOfSRC']).keys())))
    .add(df_final.apply(lambda x: pd.offsets.CustomBusinessDay(7, holidays=holidays.US(years=x['YearOfSRC']).keys())))
)
KeyError: 'YearOfSRC'
I'm sure I am using pandas apply and lambda functions incorrectly here, but I don't know why the error would be a KeyError when that's clearly a column in df_final.
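A minimal sketch of where a KeyError like this can come from, using a hypothetical two-row frame that only borrows the column names from the question. With the default axis=0, DataFrame.apply hands each column to the lambda, and a column's index holds the row labels (0, 1, ...), not the column names, so looking up 'YearOfSRC' inside it fails. With axis=1 the lambda receives each row, whose index is the column names.

```python
import pandas as pd

# Hypothetical stand-in for df_final, for illustration only.
df = pd.DataFrame({"YearOfSRC": [2022, 2021], "MonthNumberOfSRC": [3, 4]})

# Default axis=0: x is a whole column indexed by row labels 0 and 1,
# so x["YearOfSRC"] is a lookup for a row label that does not exist.
try:
    df.apply(lambda x: x["YearOfSRC"])
    raised_key_error = False
except KeyError:
    raised_key_error = True

# axis=1: x is a single row indexed by column names, so the lookup works.
years_per_row = df.apply(lambda x: x["YearOfSRC"], axis=1)

print(raised_key_error)        # True
print(years_per_row.tolist())  # [2022, 2021]
```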
EDIT: I am still looking for a solution to this problem, so I put a 100-point bounty on the question. Please give it a go!
Per my comment above, pass axis=1 to apply so that the lambda receives each row rather than each column. Try this:
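import holidays
import pandas as pd

df_final['PredictionDate'] = (pd
    .to_datetime(df_final[['YearOfSRC', 'MonthNumberOfSRC']]
                 .set_axis(['year', 'month'], axis=1)
                 .assign(day=1)
                 )
    # axis=1 passes each row to the lambda, so x['YearOfSRC'] is a valid lookup.
    # holidays.US takes `years` (cast to a plain int), and .keys() yields the holiday dates.
    .sub(df_final.apply(lambda x: pd.offsets.CustomBusinessDay(1, holidays=holidays.US(years=int(x['YearOfSRC'])).keys()), axis=1))
    .add(df_final.apply(lambda x: pd.offsets.CustomBusinessDay(7, holidays=holidays.US(years=int(x['YearOfSRC'])).keys()), axis=1))
)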
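As an alternative sketch under stated assumptions, not the approach above: the row-wise apply builds two offset objects per row, which may be slow on roughly 20,000 rows. The version below swaps in pandas' built-in USFederalHolidayCalendar for the holidays package (its holiday list may differ from holidays.US) and computes the date once per distinct month start. The helper name seventh_business_day and its parameters are hypothetical.

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar


def seventh_business_day(df, year_col="YearOfSRC", month_col="MonthNumberOfSRC"):
    """Return the 7th business day of each (year, month), skipping US federal holidays."""
    calendar = USFederalHolidayCalendar()
    one_day = pd.offsets.CustomBusinessDay(1, calendar=calendar)
    seven_days = pd.offsets.CustomBusinessDay(7, calendar=calendar)

    # First calendar day of each row's month.
    month_start = pd.to_datetime(
        df[[year_col, month_col]]
        .set_axis(["year", "month"], axis=1)
        .assign(day=1)
    )

    # Step back to the last business day of the previous month, then
    # forward 7 business days; do this once per distinct month start.
    lookup = {ts: ts - one_day + seven_days for ts in month_start.drop_duplicates()}
    return month_start.map(lookup)


# Hypothetical sample: March 2022 has no early holiday, July 2022 has July 4.
sample = pd.DataFrame({"YearOfSRC": [2022, 2022], "MonthNumberOfSRC": [3, 7]})
sample["PredictionDate"] = seventh_business_day(sample)
print(sample)  # PredictionDate: 2022-03-09 and 2022-07-12
```

Because the offsets do not depend on the row, the same pair of CustomBusinessDay objects covers every year in the frame, which avoids the per-row holiday lookup entirely.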