KeyError resulting from apply and lambda function on a DataFrame column

I generated a new column showing the 7th business day of the month for each year and month in this df:

      YearOfSRC MonthNumberOfSRC    
0       2022             3    
1       2022             4          
2       2022             5             
3       2022             6            
4       2021             4   
... ... ... ...
20528   2022             1             
20529   2022             2             
20530   2022             3            
20531   2022             4             
20532   2022             5             

With this code:

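import pandas as pd

df['PredictionDate'] = (pd
 # Build the first day of each (year, month) pair
 .to_datetime(df[['YearOfSRC', 'MonthNumberOfSRC']]
                .set_axis(['year' ,'month'], axis=1)
                .assign(day=1)
              )
 # Step back one business day, then forward seven: the 7th business day of the month
 .sub(pd.offsets.BusinessDay(1))
 .add(pd.offsets.BusinessDay(7))
)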

To output this dataframe (df_final) with the new column, PredictionDate:

       YearOfSRC    MonthNumberOfSRC    PredictionDate
0       2022             3              2022-03-09
1       2022             4              2022-04-11
2       2022             5              2022-05-10
3       2022             6              2022-06-09
4       2021             4              2021-04-09
... ... ... ...
20528   2022             1              2022-01-11
20529   2022             2              2022-02-09
20530   2022             3              2022-03-09
20531   2022             4              2022-04-11
20532   2022             5              2022-05-10

(More details here)

However, I would like to use CustomBusinessDay and Python's holidays package to modify the rows of PredictionDate where a holiday in the first week would push the 7th business day back by 1 business day. I know that CustomBusinessDay has a parameter for a holiday list, so in a modular solution I would assign the list from the holidays library to that parameter. I know I could hard-code the added business day by increasing the day by 1 for all months where there is a holiday in the first week, but I would prefer a more dynamic solution. I have tried this instead of the above code, but I get a KeyError:

df_final['PredictionDate'] = (pd
 .to_datetime(df_final[['YearOfSRC', 'MonthNumberOfSRC']]
                .set_axis(['year' ,'month'], axis=1)
                .assign(day=1)
              )
 .sub(df_final.apply(lambda x : pd.offsets.CustomBusinessDay(1, holidays = holidays.US(years = x['YearOfSRC']).keys())))
 .add(df_final.apply(lambda x: pd.offsets.CustomBusinessDay(7, holidays = holidays.US(years = x['YearOfSRC']).keys())))
)

KeyError: 'YearOfSRC'

I'm sure I am using pandas apply and the lambda functions incorrectly here, but I don't know why the error would be a KeyError when that's clearly a column in df_final.
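One likely explanation for the KeyError, sketched on a two-row stand-in for df_final (the sample values are illustrative, not the real data): by default DataFrame.apply uses axis=0 and passes each column to the function, so x['YearOfSRC'] is looked up among the row labels rather than the column names.

```python
import pandas as pd

# Illustrative stand-in for df_final; the values are assumptions.
df = pd.DataFrame({"YearOfSRC": [2022, 2021], "MonthNumberOfSRC": [3, 4]})

# Default axis=0: the lambda receives each COLUMN as a Series indexed by
# the row labels (0, 1), so the label 'YearOfSRC' is not found.
try:
    df.apply(lambda x: x["YearOfSRC"])
    error_name = None
except KeyError as exc:
    error_name = type(exc).__name__

# axis=1: the lambda receives each ROW as a Series indexed by column names.
years_by_row = df.apply(lambda x: x["YearOfSRC"], axis=1)

print(error_name)             # KeyError
print(years_by_row.tolist())  # [2022, 2021]
```

With axis=1 the same lookup succeeds, because each row Series is indexed by the column names.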

EDIT: I am still looking for a solution to this problem, so I put a 100-point bounty on the question. Please give it a go!

Per my comment above, try this (note axis=1, so that apply passes each row to the lambda):

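import holidays
import pandas as pd

df_final['PredictionDate'] = (pd
 .to_datetime(df_final[['YearOfSRC', 'MonthNumberOfSRC']]
                .set_axis(['year' ,'month'], axis=1)
                .assign(day=1)
              )
 # axis=1 passes each row to the lambda, so x['YearOfSRC'] is a valid lookup.
 # holidays.US takes `years`, and .keys() yields the holiday dates themselves.
 .sub(df_final.apply(lambda x: pd.offsets.CustomBusinessDay(1, holidays=list(holidays.US(years=int(x['YearOfSRC'])).keys())), axis=1))
 .add(df_final.apply(lambda x: pd.offsets.CustomBusinessDay(7, holidays=list(holidays.US(years=int(x['YearOfSRC'])).keys())), axis=1))
)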
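As a hedged alternative sketch, the holiday-aware 7th business day can also be computed without a per-row apply, by giving a single CustomBusinessDay a calendar that covers every year. This version swaps the holidays package for pandas' built-in USFederalHolidayCalendar (an assumption on my part; the federal calendar may not match holidays.US exactly) and uses a three-row stand-in for df_final with illustrative values.

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar

# Illustrative stand-in for df_final; the values are assumptions.
df_final = pd.DataFrame(
    {"YearOfSRC": [2022, 2022, 2022], "MonthNumberOfSRC": [3, 7, 9]}
)

# First calendar day of each (year, month) pair.
first_of_month = pd.to_datetime(
    df_final[["YearOfSRC", "MonthNumberOfSRC"]]
    .set_axis(["year", "month"], axis=1)
    .assign(day=1)
)

# One calendar covers all years, so no per-row offset is needed.
us_calendar = USFederalHolidayCalendar()
one_day = pd.offsets.CustomBusinessDay(1, calendar=us_calendar)
seven_days = pd.offsets.CustomBusinessDay(7, calendar=us_calendar)

# Holiday-aware: step back one business day, then forward seven.
df_final["PredictionDate"] = first_of_month.sub(one_day).add(seven_days)

# Plain BusinessDay version from the question, for comparison.
plain = first_of_month.sub(pd.offsets.BusinessDay(1)).add(pd.offsets.BusinessDay(7))

print(df_final["PredictionDate"].dt.strftime("%Y-%m-%d").tolist())
# ['2022-03-09', '2022-07-12', '2022-09-12']
print(plain.dt.strftime("%Y-%m-%d").tolist())
# ['2022-03-09', '2022-07-11', '2022-09-09']
```

July and September 2022 move one business day later because Independence Day and Labor Day fall within the first seven business days, while March is unchanged. pandas may emit a PerformanceWarning here, since CustomBusinessDay is applied element by element rather than vectorized.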
