Pandas apply custom function to each dataframe row and append results
How can I apply a custom function to each row of a Pandas dataframe df1, where:
- the function uses values from a column in df1
- the function uses values from another dataframe df2
- the results are appended to df1 column-wise
Example:
import pandas as pd
df1 = pd.DataFrame([1, 2, 3], columns=["x"])
df2 = pd.DataFrame({"set1": [0, 0, 0, 0], "set2": [100, 200, 300, 400]})
display(df1, df2)
And a custom function:
def myfunc(df2, x=df1["x"]):
    # Something simple but custom
    ans = df2["set1"] + df2["set2"] * x
    return ans
Desired output is
| | x | run1 | run2 | run3 | run4 |
|---|---|---|---|---|---|
| 0 | 1 | 100 | 200 | 300 | 400 |
| 1 | 2 | 200 | 400 | 600 | 800 |
| 2 | 3 | 300 | 600 | 900 | 1200 |
Here is an example function call, but how can I apply it with a one-liner to get the desired dataframe output?
test = myfunc(df2,x=3)
print(test)
If you really need a custom function, you can use apply:
# Modified slightly (x is now the first argument) to make it easier to use with apply
def myfunc(x, df2):
    return df2["set1"] + df2["set2"] * x
df1 = df1.join(df1.x.apply(myfunc, args=(df2,)).add_prefix('run'))
print(df1)
# Output:
x run0 run1 run2 run3
0 1 100 200 300 400
1 2 200 400 600 800
2 3 300 600 900 1200
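Note that add_prefix labels the new columns run0 to run3, because the labels come from df2's zero-based index, while the desired output uses run1 to run4. A minimal sketch of the same apply approach with one-based labels, assuming df2 has the default integer index (the names runs and result are only illustrative):

```python
import pandas as pd

df1 = pd.DataFrame([1, 2, 3], columns=["x"])
df2 = pd.DataFrame({"set1": [0, 0, 0, 0], "set2": [100, 200, 300, 400]})

def myfunc(x, df2):
    # Returns one Series (one value per row of df2) for a single x
    return df2["set1"] + df2["set2"] * x

# Series.apply with a function that returns a Series yields a DataFrame
# whose columns are df2's index labels (0, 1, 2, 3 here).
runs = df1["x"].apply(myfunc, args=(df2,))

# Relabel 0..3 as run1..run4 (assumes df2 has a default RangeIndex)
runs.columns = [f"run{i + 1}" for i in runs.columns]

result = df1.join(runs)
print(result)
```

Keeping the joined frame in a separate variable leaves df1 unchanged, so the snippet can be re-run without stacking duplicate columns.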
That said, there's often a way to do whatever you want to do using pandas methods:
df = df1.merge(df2, 'cross')
df['value'] = df.set1 + df.set2 * df.x
df['run'] = df.groupby('x')['value'].cumcount() + 1
df = df.pivot(index='x', columns='run', values='value')
df.columns = [f'{df.columns.name}{x}' for x in df.columns]
print(df.reset_index())
# Output:
x run1 run2 run3 run4
0 1 100 200 300 400
1 2 200 400 600 800
2 3 300 600 900 1200
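The merge/pivot approach above numbers the runs with cumcount grouped by x, which implicitly assumes the values in x are unique. A hedged variant of the same idea that keys on the original row positions instead might look like this (the column names row and run and the variables long_df, wide and result are illustrative assumptions; how="cross" needs pandas 1.2 or later):

```python
import pandas as pd

df1 = pd.DataFrame([1, 2, 3], columns=["x"])
df2 = pd.DataFrame({"set1": [0, 0, 0, 0], "set2": [100, 200, 300, 400]})

# Keep both original indexes as columns so each (row, run) pair is identifiable
left = df1.reset_index().rename(columns={"index": "row"})
right = df2.reset_index().rename(columns={"index": "run"})

# Cross join: every row of df1 paired with every row of df2 (3 * 4 = 12 rows)
long_df = left.merge(right, how="cross")
long_df["value"] = long_df["set1"] + long_df["set2"] * long_df["x"]
long_df["run"] = long_df["run"] + 1  # one-based run numbers

# Back to wide form: one row per df1 row, one column per df2 row
wide = long_df.pivot(index="row", columns="run", values="value").add_prefix("run")

result = df1.join(wide)
print(result)
```

Pivoting on the integer run number and adding the prefix afterwards keeps the columns in numeric order even when df2 has ten or more rows.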
You can do
df1 = df1.join(df1.apply(lambda x : myfunc(df2, x['x']),axis=1))
Out[152]:
x 0 1 2 3
0 1 100 200 300 400
1 2 200 400 600 800
2 3 300 600 900 1200
This is specific to your example myfunc, but it is possible to vectorize with dot. Note that the expression below computes (set1 + set2) * x, which matches myfunc only because set1 is all zeros in the example:
df1[['x']].dot(
    df2['set1'].add(df2['set2']).to_frame().T.values
).rename(
    columns={i: f'run{i+1}' for i in df2.index}
).assign(
    x=df1['x'],
)
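For a set1 that is not all zeros, one possible vectorized alternative is NumPy broadcasting, which evaluates set1 + set2 * x for every (x, df2 row) pair directly. This is a sketch rather than part of the answer above; the nonzero set1 values are made up purely to show the difference, and the names x_col, values, runs and result are illustrative:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([1, 2, 3], columns=["x"])
# Hypothetical data: set1 is nonzero here, unlike the question's example
df2 = pd.DataFrame({"set1": [1, 2, 3, 4], "set2": [100, 200, 300, 400]})

# Shape (3, 1): one row per x, ready to broadcast against df2's 4 values
x_col = df1["x"].to_numpy()[:, None]

# Broadcasting (4,) with (3, 1) gives shape (3, 4): set1[j] + set2[j] * x[i]
values = df2["set1"].to_numpy() + df2["set2"].to_numpy() * x_col

runs = pd.DataFrame(
    values,
    index=df1.index,
    columns=[f"run{i + 1}" for i in range(len(df2))],
)
result = df1.join(runs)
print(result)
```

With the question's original df2 (set1 all zeros) this yields the same numbers as the dot version, with x kept as the first column.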