Apply and lambda function resulting in KeyError on DataFrame column

Question

I generated a new column holding the 7th business day of each year and month from this df:

      YearOfSRC MonthNumberOfSRC    
0       2022             3    
1       2022             4          
2       2022             5             
3       2022             6            
4       2021             4   
... ... ... ...
20528   2022             1             
20529   2022             2             
20530   2022             3            
20531   2022             4             
20532   2022             5

With this code:

df['PredictionDate'] = (pd
 .to_datetime(df[['YearOfSRC', 'MonthNumberOfSRC']]
                .set_axis(['year' ,'month'], axis=1)
                .assign(day=1)
              )
 .sub(pd.offsets.BusinessDay(1))
 .add(pd.offsets.BusinessDay(7))
)
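
For a single month, the arithmetic in the snippet above can be traced by hand. This is only a minimal sketch on one scalar date (the variable names are illustrative, not from the original code): stepping back one business day from the 1st and then forward seven means the 1st itself counts as business day 1 whenever it is a business day.

```python
import pandas as pd

# March 1, 2022 is a Tuesday, so it is itself business day 1.
first_of_month = pd.Timestamp(year=2022, month=3, day=1)

# Step back one business day (to Monday, Feb 28) ...
anchor = first_of_month - pd.offsets.BusinessDay(1)

# ... then forward seven business days to land on the 7th business day.
seventh = anchor + pd.offsets.BusinessDay(7)

print(anchor.date(), seventh.date())  # 2022-02-28 2022-03-09
```

The result agrees with the 2022-03-09 shown for year 2022, month 3 in the output table.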

This outputs the DataFrame (df_final) with the new column, PredictionDate:

       YearOfSRC    MonthNumberOfSRC    PredictionDate
0       2022             3              2022-03-09
1       2022             4              2022-04-11
2       2022             5              2022-05-10
3       2022             6              2022-06-09
4       2021             4              2021-04-09
... ... ... ...
20528   2022             1              2022-01-11
20529   2022             2              2022-02-09
20530   2022             3              2022-03-09
20531   2022             4              2022-04-11
20532   2022             5              2022-05-10

(More details here)

However, I would like to use CustomBusinessDay and Python's holidays package to adjust the rows of PredictionDate where a holiday in the first week pushes the 7th business day back by 1 business day. I know that CustomBusinessDay has a holidays parameter, so in a modular solution I would pass the list from the holidays library to that parameter. I know I could hard-code the extra business day by adding 1 day for every month that has a holiday in the first week, but I would prefer a more dynamic solution. I tried the following instead of the code above, but I get a KeyError:
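
To illustrate the effect being described, here is a minimal sketch of how the holidays parameter of CustomBusinessDay shifts the 7th business day. It assumes a hand-written, one-entry holiday list (example_holidays is a made-up name) rather than the holidays package, purely to keep the example self-contained.

```python
import pandas as pd

# Assumption for illustration only: a hard-coded holiday list,
# standing in for whatever the holidays package would supply.
example_holidays = ["2022-07-04"]  # Independence Day, a Monday in 2022

start = pd.Timestamp("2022-07-01")  # a Friday

# Plain business days ignore the holiday.
without_holiday = start - pd.offsets.BusinessDay(1) + pd.offsets.BusinessDay(7)

# CustomBusinessDay skips the listed holiday when counting.
cbd_back = pd.offsets.CustomBusinessDay(1, holidays=example_holidays)
cbd_forward = pd.offsets.CustomBusinessDay(7, holidays=example_holidays)
with_holiday = start - cbd_back + cbd_forward

print(without_holiday.date(), with_holiday.date())  # 2022-07-11 2022-07-12
```

The holiday on July 4 moves the 7th business day from July 11 to July 12, which is the one-day pushback the question asks for.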

df_final['PredictionDate'] = (pd
 .to_datetime(df_final[['YearOfSRC', 'MonthNumberOfSRC']]
                .set_axis(['year' ,'month'], axis=1)
                .assign(day=1)
              )
 .sub(df_final.apply(lambda x : pd.offsets.CustomBusinessDay(1, holidays = holidays.US(years = x['YearOfSRC']).keys())))
 .add(df_final.apply(lambda x: pd.offsets.CustomBusinessDay(7, holidays = holidays.US(years = x['YearOfSRC']).keys())))
)

KeyError: 'YearOfSRC'

I'm sure I am using pandas apply and lambda incorrectly here, but I don't understand why the error is a KeyError when YearOfSRC is clearly a column in df_final.

EDIT: I am still looking for a solution to this problem, so I put a 100-point bounty on the question. Please give it a go!

Answer 1

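Per my comment above, pass axis=1 to apply so that the lambda receives each row rather than each column. With the default axis=0, x is a whole column indexed by row label, so x['YearOfSRC'] raises the KeyError. Try this: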

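# Assumes `import pandas as pd` and `import holidays`.
# axis=1 passes each row to the lambda, so x['YearOfSRC'] is that row's year.
# holidays.US takes `years` (not `year`); CustomBusinessDay needs the holiday dates, i.e. the keys.
df_final['PredictionDate'] = (pd
 .to_datetime(df_final[['YearOfSRC', 'MonthNumberOfSRC']]
                .set_axis(['year' ,'month'], axis=1)
                .assign(day=1)
              )
 .sub(df_final.apply(lambda x: pd.offsets.CustomBusinessDay(1, holidays=list(holidays.US(years=int(x['YearOfSRC'])).keys())), axis=1))
 .add(df_final.apply(lambda x: pd.offsets.CustomBusinessDay(7, holidays=list(holidays.US(years=int(x['YearOfSRC'])).keys())), axis=1))
)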
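
The difference between the two axis settings can be reproduced on a tiny frame. This is a minimal sketch with made-up data (the two-row df below is not the asker's DataFrame); it only shows what the lambda receives in each case.

```python
import pandas as pd

# Toy stand-in for df_final, using the same column names.
df = pd.DataFrame({"YearOfSRC": [2022, 2021], "MonthNumberOfSRC": [3, 4]})

# Default axis=0: the lambda gets each column as a Series whose index is
# the row labels (0, 1), so looking up 'YearOfSRC' in it fails.
try:
    df.apply(lambda x: x["YearOfSRC"])
    raised_key_error = False
except KeyError:
    raised_key_error = True

# axis=1: the lambda gets each row as a Series whose index is the column
# names, so 'YearOfSRC' is a valid label.
years = df.apply(lambda x: x["YearOfSRC"], axis=1)

print(raised_key_error, years.tolist())  # True [2022, 2021]
```

That is why the KeyError names a column that clearly exists: inside the lambda, the lookup is against row labels, not column names.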

Answer 1: accepted 2022-08-11 21:14:44
