Apply Function to Every Row in pandas DataFrame Using Multiple Column Values as Parameters
Given a pandas DataFrame with multiple columns:
pd.DataFrame({'name': ['Bob', 'Alice'], 'age': [20, 40], 'height': [2.0, 2.1]})
    name  age  height
0    Bob   20     2.0
1  Alice   40     2.1
And a function that takes multiple parameters:
def example_hash(name: str, age: int) -> str:
    return "In 10 years {} will be {}".format(name, age+10)
How can the DataFrame be updated with an additional column that contains the result of applying the function to a subset of the other columns?
The resulting DataFrame would be the result of applying example_hash to the name and age columns:
    name  age  height                          hash
0    Bob   20     2.0    In 10 years Bob will be 30
1  Alice   40     2.1  In 10 years Alice will be 50
I'm interested in a pandas-centric response. I understand that it's possible to construct a Python list, iterate over the rows, and append to the list, which would eventually become the column.
Thank you in advance for your consideration and response.
You can use the apply function with axis=1 to iterate over the rows and add a new column:
In [139]: df = pd.DataFrame({'name': ['Bob', 'Alice'], 'age': [20, 40], 'height': [2.0, 2.1]})
In [140]: df
Out[140]:
    name  age  height
0    Bob   20     2.0
1  Alice   40     2.1
In [142]: def example_hash(row):
     ...:     row['hash'] = "In 10 years {} will be {}".format(row['name'], row['age']+10)
     ...:     return row
     ...:
In [143]: df = df.apply(example_hash, axis=1)
In [144]: df
Out[144]:
    name  age  height                          hash
0    Bob   20     2.0    In 10 years Bob will be 30
1  Alice   40     2.1  In 10 years Alice will be 50
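For this particular function, the new column can also be built from whole-column operations, with no Python call per row. The following is a minimal sketch, assuming the function body can be rewritten as plain string concatenation; it does not call example_hash at all, so it only applies when the logic is this simple.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Bob', 'Alice'], 'age': [20, 40], 'height': [2.0, 2.1]})

# Build the sentence column-wise: add 10 to every age, convert the result
# to strings, then concatenate with the name column and the literal text.
df['hash'] = "In 10 years " + df['name'] + " will be " + (df['age'] + 10).astype(str)

print(df)
```

If the function cannot be expressed as column arithmetic or string operations, a per-row call remains the fallback.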
You can do this without changing your example_hash() method. Just use np.vectorize:
In [2204]: import numpy as np
In [2200]: def example_hash(name: str, age: int) -> str:
      ...:     return "In 10 years {} will be {}".format(name, age+10)
      ...:
In [2202]: df['new'] = np.vectorize(example_hash)(df['name'], df['age'])
In [2203]: df
Out[2203]:
    name  age  height                           new
0    Bob   20     2.0    In 10 years Bob will be 30
1  Alice   40     2.1  In 10 years Alice will be 50
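It may help to know that np.vectorize is a convenience wrapper rather than a performance tool: the NumPy documentation describes its implementation as essentially a for loop. The sketch below compares it with an explicit loop over the zipped columns, which calls the function once per row in the same way; the variable names via_vectorize and via_zip are only illustrative.

```python
import numpy as np
import pandas as pd

def example_hash(name: str, age: int) -> str:
    return "In 10 years {} will be {}".format(name, age + 10)

df = pd.DataFrame({'name': ['Bob', 'Alice'], 'age': [20, 40], 'height': [2.0, 2.1]})

# np.vectorize calls example_hash once per (name, age) pair and
# collects the results in a NumPy array.
via_vectorize = np.vectorize(example_hash)(df['name'], df['age'])

# The same per-row calls written as an explicit loop over both columns.
via_zip = [example_hash(n, a) for n, a in zip(df['name'], df['age'])]

df['new'] = via_vectorize
print(df)
```

Both forms produce the same values, so the choice between them is mostly a matter of readability.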
Or use df.apply with a lambda like this, without changing your custom method:
In [2207]: df['new'] = df.apply(lambda x: example_hash(x['name'], x['age']), axis=1)
In [2208]: df
Out[2208]:
    name  age  height                           new
0    Bob   20     2.0    In 10 years Bob will be 30
1  Alice   40     2.1  In 10 years Alice will be 50
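A small variation on the lambda approach, sketched here as one possible way to write it, is to select only the columns the function needs and unpack each row as positional arguments. This assumes the selected columns are listed in the same order as the function's parameters.

```python
import pandas as pd

def example_hash(name: str, age: int) -> str:
    return "In 10 years {} will be {}".format(name, age + 10)

df = pd.DataFrame({'name': ['Bob', 'Alice'], 'age': [20, 40], 'height': [2.0, 2.1]})

# Select just the needed columns, in parameter order, then unpack each
# row's values as the positional arguments of example_hash.
df['new'] = df[['name', 'age']].apply(lambda row: example_hash(*row), axis=1)

print(df)
```

Selecting the subset first keeps the lambda short, but it ties the call to column order, so the explicit x['name'], x['age'] form is safer if the column list might change.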