
Apply Function to Every Row in pandas DataFrame Using Multiple Column Values as Parameters

Given a pandas DataFrame with multiple columns:

pd.DataFrame({'name': ['Bob', 'Alice'], 'age': [20, 40], 'height': [2.0, 2.1]})

    name  age  height
0    Bob   20     2.0
1  Alice   40     2.1

And a function that takes multiple parameters:

def example_hash(name: str, age: int) -> str:
    return "In 10 years {} will be {}".format(name, age+10)

How can the DataFrame be updated with an additional column that contains the result of applying a function to a subset of the other columns?

The resulting DataFrame would be the result of applying example_hash to the name and age columns:

    name  age  height                          hash
0    Bob   20     2.0    In 10 years Bob will be 30
1  Alice   40     2.1  In 10 years Alice will be 50

I'm interested in a pandas-centric response. I understand that it's possible to construct a Python list, iterate over the rows, and append to the list, which would eventually become the column.

Thank you in advance for your consideration and response.

You can use the apply function with axis=1 to iterate over the rows and add a new column.

In [139]: df = pd.DataFrame({'name': ['Bob', 'Alice'], 'age': [20, 40], 'height': [2.0, 2.1]})

In [140]: df
Out[140]:
    name  age  height
0    Bob   20     2.0
1  Alice   40     2.1


In [142]: def example_hash(row):
     ...:     row['hash']= "In 10 years {} will be {}".format(row['name'], row['age']+10)
     ...:     return row
     ...:

In [143]: df = df.apply(example_hash,axis=1)

In [144]: df
Out[144]:
    name  age  height                          hash
0    Bob   20     2.0    In 10 years Bob will be 30
1  Alice   40     2.1  In 10 years Alice will be 50

You can do this without changing your example_hash() function:

Just use np.vectorize:

In [2204]: import numpy as np

In [2200]: def example_hash(name: str, age: int) -> str:
      ...:     return "In 10 years {} will be {}".format(name, age+10)
      ...:
In [2202]: df['new'] = np.vectorize(example_hash)(df['name'], df['age'])

In [2203]: df
Out[2203]:
    name  age  height                           new
0    Bob   20     2.0    In 10 years Bob will be 30
1  Alice   40     2.1  In 10 years Alice will be 50
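
For readers who want to run this outside an IPython session, here is a minimal self-contained sketch of the same approach. Note that NumPy's documentation describes np.vectorize as a convenience wrapper that is essentially a loop, so it should not be expected to outperform a plain Python loop. The otypes=[object] argument and the name hash_all are additions for illustration, not part of the answer above; otypes asks NumPy to return an object array of Python strings rather than inferring the output type from the first call.

```python
import numpy as np
import pandas as pd


def example_hash(name: str, age: int) -> str:
    return "In 10 years {} will be {}".format(name, age + 10)


df = pd.DataFrame({'name': ['Bob', 'Alice'], 'age': [20, 40], 'height': [2.0, 2.1]})

# Wrap the scalar function so it can be called on whole columns.
# otypes=[object] is an assumption added here: it fixes the output
# dtype up front instead of letting NumPy infer it from the first result.
hash_all = np.vectorize(example_hash, otypes=[object])

# Each (name, age) pair is passed to example_hash element by element.
df['new'] = hash_all(df['name'], df['age'])

print(df)
```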

Or use df.apply with a lambda, again without changing your custom function:

In [2207]: df['new'] = df.apply(lambda x: example_hash(x['name'], x['age']), axis=1)

In [2208]: df
Out[2208]:
    name  age  height                           new
0    Bob   20     2.0    In 10 years Bob will be 30
1  Alice   40     2.1  In 10 years Alice will be 50
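
Two further variations, sketched here as possibilities rather than taken from the answers above. The first zips the two columns and calls the unchanged example_hash in a list comprehension, which avoids building a Series object for every row as apply with axis=1 does. The second re-implements the function's logic with column-wise string operations; it only works because this particular function is simple string formatting, and whether it is actually faster is worth measuring on your own data. The column names hash_zip and hash_vec are hypothetical.

```python
import pandas as pd


def example_hash(name: str, age: int) -> str:
    return "In 10 years {} will be {}".format(name, age + 10)


df = pd.DataFrame({'name': ['Bob', 'Alice'], 'age': [20, 40], 'height': [2.0, 2.1]})

# Variation 1: pair up the column values and call the function per row.
# pandas aligns the resulting list with the DataFrame by position.
df['hash_zip'] = [example_hash(n, a) for n, a in zip(df['name'], df['age'])]

# Variation 2: express the same formatting with whole-column operations.
# This duplicates the logic of example_hash instead of calling it.
df['hash_vec'] = "In 10 years " + df['name'] + " will be " + (df['age'] + 10).astype(str)

print(df)
```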

