Pandas: apply a custom function to each dataframe row

Question

How can I apply a custom function to each row of a Pandas dataframe df1, where:

the function uses values from a column in df1
the function uses values from another dataframe df2
the results are appended to df1 column-wise

Example:

import pandas as pd

df1 = pd.DataFrame([1, 2, 3], columns=["x"])

df2 = pd.DataFrame({"set1": [0, 0, 0, 0], "set2": [100, 200, 300, 400]})

display(df1, df2)

And the custom function:

def myfunc(df2, x=df1["x"]):
    # Something simple but custom
    ans = df2["set1"] + df2["set2"] * x
    return ans

Desired output is:

   x  run1  run2  run3  run4
0  1   100   200   300   400
1  2   200   400   600   800
2  3   300   600   900  1200

Here is an example function call, but how can I apply it with a one-liner to get the desired dataframe output?

test = myfunc(df2,x=3)
print(test)

Answer 1

If you really need a custom function, you can use apply:

# Modified slightly to make using it easier~
def myfunc(x, df2):
    return df2["set1"] + df2["set2"] * x

df1 = df1.join(df1.x.apply(myfunc, args=(df2,)).add_prefix('run'))
print(df1)

# Output:

   x  run0  run1  run2  run3
0  1   100   200   300   400
1  2   200   400   600   800
2  3   300   600   900  1200
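Note that add_prefix labels the columns run0 to run3, because the column names come from df2's index. A minimal self-contained sketch of the same apply approach, relabelled to run1 to run4 to match the desired output, could look like this (the names runs and result are illustrative, not from the original answer):

```python
import pandas as pd

df1 = pd.DataFrame([1, 2, 3], columns=["x"])
df2 = pd.DataFrame({"set1": [0, 0, 0, 0], "set2": [100, 200, 300, 400]})

def myfunc(x, df2):
    # Returns a Series with one entry per row of df2.
    return df2["set1"] + df2["set2"] * x

# Because myfunc returns a Series, Series.apply collects the results
# into a DataFrame whose columns are df2's index labels (0..3).
runs = df1["x"].apply(myfunc, args=(df2,))

# Relabel 0..3 as run1..run4 (illustrative naming chosen to match the question).
runs.columns = [f"run{i + 1}" for i in range(runs.shape[1])]

result = df1.join(runs)
print(result)
```

Keeping the result in a separate variable leaves df1 unchanged, so the snippet can be re-run without accumulating duplicate columns.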

That said, there's often a way to do what you want using built-in pandas methods:

# Pair every row of df1 with every row of df2
df = df1.merge(df2, how='cross')
df['value'] = df.set1 + df.set2 * df.x
# Number the df2 rows within each x: 1, 2, 3, 4
df['run'] = df.groupby('x')['value'].cumcount() + 1
# Reshape from long to wide: one row per x, one column per run
df = df.pivot(index='x', columns='run', values='value')
df.columns = [f'{df.columns.name}{x}' for x in df.columns]
print(df.reset_index())

# Output:

   x  run1  run2  run3  run4
0  1   100   200   300   400
1  2   200   400   600   800
2  3   300   600   900  1200
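To see what the cross merge contributes before the pivot, it can help to inspect the intermediate long-format frame. The sketch below starts from fresh copies of the example frames and assumes a pandas version that supports how="cross" (1.2 or later, as far as I know); the name pairs is illustrative.

```python
import pandas as pd

df1 = pd.DataFrame([1, 2, 3], columns=["x"])
df2 = pd.DataFrame({"set1": [0, 0, 0, 0], "set2": [100, 200, 300, 400]})

# Cross join: every row of df1 is paired with every row of df2.
pairs = df1.merge(df2, how="cross")
print(pairs.shape)    # 3 rows x 4 rows = 12 pairs; columns x, set1, set2
print(pairs.head(4))  # the four df2 rows paired with the first x value

# Position of each pair within its x group, later used as the run number.
pairs["run"] = pairs.groupby("x").cumcount() + 1
print(pairs[["x", "run"]])
```

The run numbering relies on each x value being unique in df1; with repeated x values the groups would merge and the counter would keep climbing past 4.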

Answer 2

You can do:

df1 = df1.join(df1.apply(lambda x  : myfunc(df2, x['x']),axis=1))
Out[152]: 
   x    0    1    2     3
0  1  100  200  300   400
1  2  200  400  600   800
2  3  300  600  900  1200

Answer 3

This is specific to your example myfunc, but it is possible to vectorize with dot:

df1[['x']].dot(
    df2['set1'].add(df2['set2']).to_frame().T.values
).rename(
    columns={i:f'run{i+1}' for i in df2.index}
).assign(
    x = df1['x'],
)
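The dot version multiplies x by (set1 + set2), which equals set1 + set2 * x only because set1 is all zeros in the example. A hedged alternative that keeps the original formula for any set1 is NumPy broadcasting. This is a sketch rather than part of the original answer, and the names values, runs and result are illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([1, 2, 3], columns=["x"])
df2 = pd.DataFrame({"set1": [0, 0, 0, 0], "set2": [100, 200, 300, 400]})

# Column vector of x values, shape (3, 1).
x = df1["x"].to_numpy()[:, None]
# Row vectors of the df2 columns, shape (1, 4).
set1 = df2["set1"].to_numpy()[None, :]
set2 = df2["set2"].to_numpy()[None, :]

# Broadcasting (1, 4) against (3, 1) gives a (3, 4) grid:
# one row per x, one column per row of df2.
values = set1 + set2 * x

runs = pd.DataFrame(
    values,
    index=df1.index,
    columns=[f"run{i + 1}" for i in range(len(df2))],
)
result = df1.join(runs)
print(result)
```

This only works when the custom function can be written as elementwise array arithmetic; for anything less regular, the apply-based answers remain the more general option.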
