Apply a function to columns of a pandas DataFrame

Question

This works fine:

import pandas as pd

def fnc(m):
    return m+4

df = pd.DataFrame({"m": [1,2,3,4,5,6], "c": [1,1,1,1,1,1], "x":[5,3,6,2,6,1]})
df
# apply a self created function to a single column in pandas
df["y"] = df['m'].apply(fnc)
df

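I am trying to modify the code above. Here I need to add the values of column m to the values of column c and assign the result to column y: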

import pandas as pd

def fnc(m,c):
    return m+c

df = pd.DataFrame({"m": [1,2,3,4,5,6], "c": [1,1,1,1,1,1], "x":[5,3,6,2,6,1]})
df
# apply a self created function to a single column in pandas
df["y"] = df[['m','c']].apply(fnc)
df

It gives me this error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
d:\del\asfasf.py in 
      8 df
      9 # apply a self created function to a single column in pandas
----> 10 df["y"] = df[['m','c']].apply(fnc)
     11 df

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py in apply(self, func, axis, raw, result_type, args, **kwds)
   6876             kwds=kwds,
   6877         )
-> 6878         return op.get_result()
   6879 
   6880     def applymap(self, func) -> "DataFrame":

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\apply.py in get_result(self)
    184             return self.apply_raw()
    185 
--> 186         return self.apply_standard()
    187 
    188     def apply_empty_result(self):

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\apply.py in apply_standard(self)
    294             try:
    295                 result = libreduction.compute_reduction(
--> 296                     values, self.f, axis=self.axis, dummy=dummy, labels=labels
    297                 )
    298             except ValueError as err:

pandas\_libs\reduction.pyx in pandas._libs.reduction.compute_reduction()

pandas\_libs\reduction.pyx in pandas._libs.reduction.Reducer.get_result()

TypeError: fnc() missing 1 required positional argument: 'c'

Question: how do I fix my second snippet? If possible, please answer with standard function syntax (not a lambda function).

Answer 1

Pass axis=1 to apply so that the function is called once per row, and access each column inside the function. Try this:

def fnc(m):
    return (m.m+m.c)

df = pd.DataFrame({"m": [1,2,3,4,5,6], "c": [1,1,1,1,1,1], "x":[5,3,6,2,6,1]})
df["y"] = df[['m',"c"]].apply(fnc,axis=1)

Or you can call apply on df directly, without selecting the "m" and "c" columns first.

df["y"] = df.apply(fnc,axis=1)
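To make the difference between the two axes concrete, here is a minimal self-contained sketch. It assumes the same sample data as the question; the names add_two and failed_with_default_axis are only for illustration. With the default axis=0, apply calls the function once per column and passes the whole column as a single argument, which is why a two-argument function fails. With axis=1 it passes one row at a time.

```python
import pandas as pd

def fnc(row):
    # With axis=1, `row` is a Series holding one row; columns are read by label.
    return row["m"] + row["c"]

def add_two(m, c):
    # Two-argument version from the question (illustrative name).
    return m + c

df = pd.DataFrame({"m": [1, 2, 3, 4, 5, 6],
                   "c": [1, 1, 1, 1, 1, 1],
                   "x": [5, 3, 6, 2, 6, 1]})

# Default axis=0: apply hands add_two one whole column, so `c` is missing.
failed_with_default_axis = False
try:
    df[["m", "c"]].apply(add_two)
except TypeError:
    failed_with_default_axis = True

# axis=1: fnc receives each row and returns one value per row.
df["y"] = df.apply(fnc, axis=1)
print(df["y"].tolist())  # [2, 3, 4, 5, 6, 7]
```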


Answer 2

Try this: set axis to 1 so that apply passes each row to the function, and use *x to unpack the row's values as positional arguments.

df["y"] = df[['m','c']].apply(lambda x : fnc(*x), axis=1)
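Since the question asks for a version without lambda, the same idea can be sketched with a named wrapper. This is only one possible arrangement: fnc_row is a hypothetical helper name, and the y_direct column is added purely for comparison. Because fnc uses nothing but +, it may also be called directly on the two columns, which skips apply altogether.

```python
import pandas as pd

def fnc(m, c):
    return m + c

def fnc_row(row):
    # Hypothetical named wrapper that does the job of the lambda:
    # it pulls the two values out of the row and forwards them to fnc.
    return fnc(row["m"], row["c"])

df = pd.DataFrame({"m": [1, 2, 3, 4, 5, 6],
                   "c": [1, 1, 1, 1, 1, 1],
                   "x": [5, 3, 6, 2, 6, 1]})

# Row-wise application with a plain function instead of a lambda.
df["y"] = df[["m", "c"]].apply(fnc_row, axis=1)

# Alternative: pass the columns themselves; Series addition is element-wise.
df["y_direct"] = fnc(df["m"], df["c"])
```

The direct call is usually faster on large frames because it avoids a Python-level call per row, but it only works when the function body is made of operations that Series support.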

Answer 1
3 accepted 2020-07-04 04:00:35

Answer 2
1 2020-07-04 03:57:32
