Apply function to dataframe column

I have a data frame x.

Please view the x dataframe here.

We want to create a new column, Finish, using the function below, which adds the number of business days in the Complete column to a start date.

import datetime
def date_by_adding_business_days(from_date, add_days):
    business_days_to_add = add_days
    current_date = from_date
    while business_days_to_add > 0:
        current_date += datetime.timedelta(days=1)
        weekday = current_date.weekday()
        if weekday >= 5: # sunday = 6
            continue
        business_days_to_add -= 1
    return current_date

I tried the following and got the error below. Please help.

x['Finish'] = x.apply(date_by_adding_business_days(datetime.date.today(), x['Complete']))

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Try to refactor your code. Since the function only needs one column, call apply on that column rather than on the whole dataframe. In your attempt the function is called immediately with the entire x['Complete'] Series, so the comparison business_days_to_add > 0 is evaluated on a Series, which raises the ValueError; apply needs the function itself, not the result of calling it. There is also no need to pass today's date in when you can get it inside the function:

import datetime
def date_by_adding_business_days(add_days):
    business_days_to_add = add_days
    current_date = datetime.date.today()
    while business_days_to_add > 0:
        current_date += datetime.timedelta(days=1)
        weekday = current_date.weekday()
        if weekday >= 5: # sunday = 6
            continue
        business_days_to_add -= 1
    return current_date

x['Finish'] = x['Complete'].apply(date_by_adding_business_days)
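If you would rather keep the original two-argument function, one option is to bind the start date first and pass the resulting callable to apply. This is only a sketch: the sample values in Complete and the fixed start date are made up so that the result is repeatable, and functools.partial is my choice here, not something from the question.

```python
import datetime
from functools import partial

import pandas as pd


def date_by_adding_business_days(from_date, add_days):
    business_days_to_add = add_days
    current_date = from_date
    while business_days_to_add > 0:
        current_date += datetime.timedelta(days=1)
        if current_date.weekday() >= 5:  # Saturday = 5, Sunday = 6
            continue
        business_days_to_add -= 1
    return current_date


# Made-up sample data; the real x comes from the question.
x = pd.DataFrame({"Complete": [1, 3, 5]})

# Fixed start date (a Friday) instead of datetime.date.today(), for repeatability.
start = datetime.date(2024, 1, 5)

# partial() fills in from_date, so apply() only has to supply add_days.
add_from_start = partial(date_by_adding_business_days, start)
x["Finish"] = x["Complete"].apply(add_from_start)
print(x)
```

With these values Finish comes out as 2024-01-08, 2024-01-10 and 2024-01-12, since the weekend right after the start date is skipped.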

Your approach is correct; you just need to pass a reference to the function rather than call it.

When you call apply with axis=1, it passes each dataframe row to the function and calls it.

You can compute values such as today's date inside the function itself:

import datetime

def date_by_adding_business_days(row):
    add_days = row['Complete']
    from_date = datetime.date.today()

    business_days_to_add = add_days
    current_date = from_date
    while business_days_to_add > 0:
        current_date += datetime.timedelta(days=1)
        weekday = current_date.weekday()
        if weekday >= 5:  # sunday = 6
            continue
        business_days_to_add -= 1
    return current_date

x['Finish'] = x.apply(date_by_adding_business_days, axis=1)
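Row-wise apply is mainly worth it when the function needs more than one column. As a sketch, suppose the dataframe also had a per-row start date. The Start column below is hypothetical (the question's dataframe is not shown), and the sample values are made up.

```python
import datetime

import pandas as pd


def add_business_days(from_date, add_days):
    current_date = from_date
    remaining = add_days
    while remaining > 0:
        current_date += datetime.timedelta(days=1)
        if current_date.weekday() >= 5:  # Saturday = 5, Sunday = 6
            continue
        remaining -= 1
    return current_date


# Hypothetical data: "Start" is an assumed column, not taken from the question.
x = pd.DataFrame({
    "Start": [datetime.date(2024, 1, 5), datetime.date(2024, 1, 8)],  # Friday, Monday
    "Complete": [1, 3],
})

# axis=1 hands each row to the lambda, which picks out the two columns it needs.
x["Finish"] = x.apply(
    lambda row: add_business_days(row["Start"], row["Complete"]),
    axis=1,
)
print(x)
```

Here the first row lands on Monday 2024-01-08 and the second on Thursday 2024-01-11.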

Although both approaches above serve the purpose, I think the option that calls apply() on the single column is faster than the one that applies the function row by row. If you time these solutions in a notebook against the sample dataframe provided in the question, the difference seems obvious.

Attached is a screenshot (timing the apply() functions).
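To check this yourself, something like the following should work. It is a rough sketch using timeit from the standard library. The dataframe contents, the fixed START date and the helper names by_value and by_row are all made up for the comparison, and the actual numbers will depend on your machine and pandas version.

```python
import datetime
import timeit

import pandas as pd

START = datetime.date(2024, 1, 5)  # fixed date (assumption) so both runs do the same work


def add_business_days(from_date, add_days):
    current_date = from_date
    remaining = add_days
    while remaining > 0:
        current_date += datetime.timedelta(days=1)
        if current_date.weekday() >= 5:  # Saturday = 5, Sunday = 6
            continue
        remaining -= 1
    return current_date


def by_value(add_days):
    # Used with Series.apply: receives one value of the column.
    return add_business_days(START, add_days)


def by_row(row):
    # Used with DataFrame.apply(axis=1): receives a whole row.
    return add_business_days(START, row["Complete"])


# Made-up data: 1000 rows of small business-day counts.
x = pd.DataFrame({"Complete": [1, 3, 5, 10] * 250})

column_time = timeit.timeit(lambda: x["Complete"].apply(by_value), number=5)
row_time = timeit.timeit(lambda: x.apply(by_row, axis=1), number=5)

print(f"Series.apply:             {column_time:.4f} s")
print(f"DataFrame.apply(axis=1):  {row_time:.4f} s")
```

Both calls produce the same dates; only the time taken differs.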
