Apply a function to each row of a pandas DataFrame, using multiple column values as arguments

Question

Given a pandas DataFrame with multiple columns:

pd.DataFrame({'name': ['Bob', 'Alice'], 'age': [20, 40], 'height': [2.0, 2.1]})

    name  age  height
0    Bob   20     2.0
1  Alice   40     2.1

And a function that takes multiple parameters:

def example_hash(name: str, age: int) -> str:
    return "In 10 years {} will be {}".format(name, age+10)

How can the DataFrame be updated with an additional column that contains the result of applying the function to a subset of the other columns?

The resulting DataFrame would be the result of applying example_hash to the name and age columns:

    name  age  height                          hash
0    Bob   20     2.0    In 10 years Bob will be 30
1  Alice   40     2.1  In 10 years Alice will be 50

I'm interested in a pandas-centric answer. I understand that it's possible to construct a Python list, iterate over the rows, and append each result to the list, which would eventually become the column.

Thank you in advance for your consideration and response.

Answer 1

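You can use `DataFrame.apply` with `axis=1` to run a function on each row and add a new column. Note that this version changes example_hash so that it takes the whole row and returns it with the new value added.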

In [139]: df = pd.DataFrame({'name': ['Bob', 'Alice'], 'age': [20, 40], 'height': [2.0, 2.1]})

In [140]: df
Out[140]:
    name  age  height
0    Bob   20     2.0
1  Alice   40     2.1


In [142]: def example_hash(row):
     ...:     row['hash']= "In 10 years {} will be {}".format(row['name'], row['age']+10)
     ...:     return row
     ...:

In [143]: df = df.apply(example_hash,axis=1)

In [144]: df
Out[144]:
    name  age  height                          hash
0    Bob   20     2.0    In 10 years Bob will be 30
1  Alice   40     2.1  In 10 years Alice will be 50

Answer 2

You can do this without changing your example_hash() function:

Just use `np.vectorize`:

In [2204]: import numpy as np 

In [2200]: def example_hash(name: str, age: int) -> str: 
      ...:     return "In 10 years {} will be {}".format(name, age+10) 
      ...:                                    
In [2202]: df['new'] = np.vectorize(example_hash)(df['name'], df['age'])                                                                                                                                    

In [2203]: df                                                                                                                                                                                               
Out[2203]: 
    name  age  height                           new
0    Bob   20     2.0    In 10 years Bob will be 30
1  Alice   40     2.1  In 10 years Alice will be 50
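A note on what this does: the NumPy documentation describes `np.vectorize` as a convenience wrapper rather than a performance tool, so it still calls example_hash once per row. The sketch below is only an illustration of that point. It passes `otypes=[object]` (an optional choice here, not something the answer above requires) and compares the result with a plain Python loop over the two columns; the column name `new_loop` is made up for the comparison.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'name': ['Bob', 'Alice'], 'age': [20, 40], 'height': [2.0, 2.1]})

def example_hash(name: str, age: int) -> str:
    return "In 10 years {} will be {}".format(name, age + 10)

# Wrap the scalar function; otypes=[object] states the output dtype explicitly
# instead of letting NumPy infer it from the first call.
vectorized_hash = np.vectorize(example_hash, otypes=[object])
df['new'] = vectorized_hash(df['name'], df['age'])

# Equivalent plain-Python loop over the same two columns (hypothetical column name).
df['new_loop'] = [example_hash(n, a) for n, a in zip(df['name'], df['age'])]

print(df[['new', 'new_loop']])
```

Both columns hold the same strings, which is why `np.vectorize` should be read as a tidier way to write the loop, not as a faster one.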

Or use `df.apply` with a `lambda`, again without changing your custom function:

In [2207]: df['new'] = df.apply(lambda x: example_hash(x['name'], x['age']), axis=1)                                                                                                                        

In [2208]: df                                                                                                                                                                                               
Out[2208]: 
    name  age  height                           new
0    Bob   20     2.0    In 10 years Bob will be 30
1  Alice   40     2.1  In 10 years Alice will be 50
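If the function is as simple as this one, another option is to skip the per-row call entirely and build the column from whole-column operations. This is a different technique from the two shown above: it does not call example_hash at all, it just reproduces the same string with Series arithmetic and concatenation. A minimal sketch, assuming the same example data:

```python
import pandas as pd

df = pd.DataFrame({'name': ['Bob', 'Alice'], 'age': [20, 40], 'height': [2.0, 2.1]})

# Add 10 to the whole age column, convert it to strings,
# then concatenate element-wise with the name column.
df['new'] = "In 10 years " + df['name'] + " will be " + (df['age'] + 10).astype(str)

print(df)
```

This only works when the logic can be expressed with column operations; for an arbitrary Python function, `np.vectorize` or `df.apply` with `axis=1` remains the general approach.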

Answer 1
2 2020-04-22 11:25:23

Answer 2
2 Accepted 2020-04-22 11:26:18
