Apply own function to every item in DataFrame

I have created a function that assigns a rank based on the value in each cell of the table below. The table is named "ranked":

Date        MMM     AOS     ABT
2016-01-31  55.0    411.0   102.0
2016-02-29  44.0    425.0   96.0
2016-03-31  29.0    410.0   70.0
2016-04-30  29.0    425.0   87.0
2016-05-31  46.0    409.0   52.0

Function:

def get_rank(x):
    if 1 <= x < 96:
        return 1
    elif 96 <= x < 193:
        return 2
    elif 193 <= x < 289:
        return 3
    elif 289 <= x <= 385:
        return 4
    elif x > 385:
        return 5

I have tried to apply the function using a lambda:

ranked.apply(lambda x: get_rank(x))

However, it gives me this error message:

The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

The end goal is to have a 1 for every value in the table below 96, a 2 for every value from 96 up to (but not including) 193, and so on up to 5.

Could you please give me a hint on how I can easily apply this function to the table? I appreciate your help!

Use applymap instead:

>>> ranked[['MMM','AOS','ABT']].applymap(get_rank)

This should return the sub-DataFrame with the columns MMM, AOS and ABT, where your get_rank() function has been applied to each value.
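For anyone who wants to try this, here is a minimal runnable sketch. It assumes the table is a DataFrame with "Date" stored as a plain string column, which the question does not confirm. It also assumes that you may be on a newer pandas: since pandas 2.1, applymap is deprecated in favour of DataFrame.map, so the sketch uses whichever one is available.

```python
import pandas as pd


def get_rank(x):
    # Thresholds copied from the question.
    if 1 <= x < 96:
        return 1
    elif 96 <= x < 193:
        return 2
    elif 193 <= x < 289:
        return 3
    elif 289 <= x <= 385:
        return 4
    elif x > 385:
        return 5


# Sample data from the question; "Date" as a string column is an assumption.
ranked = pd.DataFrame({
    "Date": ["2016-01-31", "2016-02-29", "2016-03-31", "2016-04-30", "2016-05-31"],
    "MMM": [55.0, 44.0, 29.0, 29.0, 46.0],
    "AOS": [411.0, 425.0, 410.0, 425.0, 409.0],
    "ABT": [102.0, 96.0, 70.0, 87.0, 52.0],
})

values = ranked[["MMM", "AOS", "ABT"]]

# Use DataFrame.map where it exists (pandas >= 2.1), otherwise applymap.
elementwise = values.map if hasattr(values, "map") else values.applymap
result = elementwise(get_rank)

print(result)
```

With this data, every MMM value falls in rank 1, every AOS value in rank 5, and ABT is split between ranks 2 and 1.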

I would just add ranked columns to the data frame:

df['Ranked MMM']=[get_rank(i) for i in df['MMM']]

Also add a default return at the end of the function, such as return 0.
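A rough sketch of both suggestions together. The name get_rank_with_default and the sample values are made up for illustration. As written in the question, get_rank silently returns None for anything below 1 and for NaN, which is what the default return guards against.

```python
import pandas as pd


def get_rank_with_default(x):
    # Hypothetical variant of get_rank: same thresholds, plus a fallback.
    if 1 <= x < 96:
        return 1
    elif 96 <= x < 193:
        return 2
    elif 193 <= x < 289:
        return 3
    elif 289 <= x <= 385:
        return 4
    elif x > 385:
        return 5
    # Reached for values below 1 and for NaN (all comparisons are False).
    return 0


# Illustrative values only, chosen to hit the fallback branch.
df = pd.DataFrame({"MMM": [55.0, 0.5, 300.0, float("nan")]})

# One new column per ranked source column; the original column is kept.
df["Ranked MMM"] = [get_rank_with_default(i) for i in df["MMM"]]

print(df)
```

Whether 0 is a sensible rank for out-of-range or missing values depends on your data; it is only a placeholder here.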

You can use .applymap to do this. You don't have to wrap your function in a lambda, since it already takes a single value as its argument. Because applying your function to the date column would raise an error, you have to specify which columns to map over and then replace the original values.

apply_on = ["MMM", "AOS", "ABT"]
ranked[apply_on] = ranked[apply_on].applymap(get_rank)
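To see why the original attempt fails and this one works, here is a small sketch, again assuming "Date" is a string column. DataFrame.apply passes each whole column (a Series) to get_rank, so the chained comparison 1 <= x < 96 asks for a single True/False from a Series and raises the error from the question. The element-wise call passes one number at a time. On pandas 2.1 and later, DataFrame.map is the non-deprecated name for applymap, so the sketch picks whichever exists.

```python
import pandas as pd


def get_rank(x):
    # Thresholds copied from the question.
    if 1 <= x < 96:
        return 1
    elif 96 <= x < 193:
        return 2
    elif 193 <= x < 289:
        return 3
    elif 289 <= x <= 385:
        return 4
    elif x > 385:
        return 5


# Sample data from the question; "Date" as a string column is an assumption.
ranked = pd.DataFrame({
    "Date": ["2016-01-31", "2016-02-29", "2016-03-31", "2016-04-30", "2016-05-31"],
    "MMM": [55.0, 44.0, 29.0, 29.0, 46.0],
    "AOS": [411.0, 425.0, 410.0, 425.0, 409.0],
    "ABT": [102.0, 96.0, 70.0, 87.0, 52.0],
})
apply_on = ["MMM", "AOS", "ABT"]

# Column-wise apply: get_rank receives a Series and the comparison fails.
try:
    ranked[apply_on].apply(get_rank)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Element-wise call: get_rank receives one value at a time.
subset = ranked[apply_on]
elementwise = subset.map if hasattr(subset, "map") else subset.applymap
ranked[apply_on] = elementwise(get_rank)

# "Date" is left untouched; the three value columns now hold ranks.
print(ranked)
```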
