Using df.apply() to apply a function with arguments to each row

Question

I've seen plenty of SO questions about using the pandas df.apply() function when the function being applied is trivial (like .upper(), or simple multiplication). However, when I try to apply my own custom function, I keep getting all sorts of errors. I don't know where to start with the error shown at the end of this post.

Here is my simplified example.

My fake data:

import pandas as pd

inp = [{'c1':10, 'c2':1}, {'c1':11,'c2':110}, {'c1':12,'c2':0}]
df1 = pd.DataFrame(inp)
print(df1)

My fake function:

def fake_funk(row, upper, lower):
    if lower < row['c1'] < upper:
        return(1)
    elif row['c2'] > upper:
        return(2)
    else:
        return(0)

Testing that it does in fact work:

for index, row in df1.iterrows():
    print(fake_funk(row,11,1))

1
2
0

Now using apply():

df1.apply(lambda row,: fake_funk(row,11,1))

The error I am getting is pretty long:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5126)()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item (pandas/_libs/hashtable.c:14010)()

TypeError: an integer is required

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-116-a554e891e761> in <module>()
----> 1 df1.apply(lambda row,: fake_funk(row,11,1))

/usr/local/anaconda3/lib/python3.5/site-packages/pandas/core/frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
   4260                         f, axis,
   4261                         reduce=reduce,
-> 4262                         ignore_failures=ignore_failures)
   4263             else:
   4264                 return self._apply_broadcast(f, axis)

/usr/local/anaconda3/lib/python3.5/site-packages/pandas/core/frame.py in _apply_standard(self, func, axis, ignore_failures, reduce)
   4356             try:
   4357                 for i, v in enumerate(series_gen):
-> 4358                     results[i] = func(v)
   4359                     keys.append(v.name)
   4360             except Exception as e:

<ipython-input-116-a554e891e761> in <lambda>(row)
----> 1 df1.apply(lambda row,: fake_funk(row,11,1))

<ipython-input-115-e95f3470fb25> in fake_funk(row, upper, lower)
      1 def fake_funk(row, upper, lower):
----> 2     if lower < row['c1'] < upper:
      3         return(1)
      4     elif row['c2'] > upper:
      5         return(2)

/usr/local/anaconda3/lib/python3.5/site-packages/pandas/core/series.py in __getitem__(self, key)
    599         key = com._apply_if_callable(key, self)
    600         try:
--> 601             result = self.index.get_value(self, key)
    602 
    603             if not is_scalar(result):

/usr/local/anaconda3/lib/python3.5/site-packages/pandas/core/indexes/base.py in get_value(self, series, key)
   2475         try:
   2476             return self._engine.get_value(s, k,
-> 2477                                           tz=getattr(series.dtype, 'tz', None))
   2478         except KeyError as e1:
   2479             if len(self) > 0 and self.inferred_type in ['integer', 'boolean']:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value (pandas/_libs/index.c:4404)()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value (pandas/_libs/index.c:4087)()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5210)()

KeyError: ('c1', 'occurred at index c1')

Answer 1

By default, apply operates along axis 0, so your function is called once per column, and each column is a Series indexed by the row labels 0, 1, 2. That is why row['c1'] raises the KeyError. It seems you need an operation along axis 1, which passes each row to the function. By the way, you don't need a lambda either. Just pass the extra arguments through the args parameter, which should be enough.

df1.apply(fake_funk, axis=1, args=(11, 1))

0    1
1    2
2    0
dtype: int64
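To see why the axis matters, here is a minimal sketch that reuses the question's data and function. The probe lambdas and the variable names (has_c1_axis0, has_c1_axis1, via_args, via_kwargs) are only for illustration and are not from the original post. It also shows that extra arguments can be passed by keyword, since apply forwards additional keyword arguments to the function.

```python
import pandas as pd

df1 = pd.DataFrame([{'c1': 10, 'c2': 1}, {'c1': 11, 'c2': 110}, {'c1': 12, 'c2': 0}])

def fake_funk(row, upper, lower):
    if lower < row['c1'] < upper:
        return 1
    elif row['c2'] > upper:
        return 2
    else:
        return 0

# axis=0 (the default): the function receives each column.
# A column's index holds the row labels 0, 1, 2, so 'c1' is not a valid key.
has_c1_axis0 = df1.apply(lambda s: 'c1' in s.index)
print(has_c1_axis0.tolist())  # [False, False]

# axis=1: the function receives each row, whose index holds the column names.
has_c1_axis1 = df1.apply(lambda s: 'c1' in s.index, axis=1)
print(has_c1_axis1.tolist())  # [True, True, True]

# Extra positional arguments go through args=...
via_args = df1.apply(fake_funk, axis=1, args=(11, 1))

# ...or they can be passed by keyword, which apply forwards to the function.
via_kwargs = df1.apply(fake_funk, axis=1, upper=11, lower=1)

print(via_args.tolist())    # [1, 2, 0]
print(via_kwargs.tolist())  # [1, 2, 0]
```

Passing the arguments by keyword can be easier to read than a positional tuple when the function takes several parameters, because the call site shows which value is upper and which is lower.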

3 votes, accepted, 2017-10-19 00:10:41