Apply a function with arguments to each row using df.apply()

I've seen plenty of SO questions about using the pandas df.apply() function when the function being applied is trivial (like .upper(), or simple multiplication). However, when I try to apply my own custom function, I keep getting all sorts of errors. I don't know where to start with the error below.

Here is my simplified example.

My fake data:

import pandas as pd

inp = [{'c1': 10, 'c2': 1}, {'c1': 11, 'c2': 110}, {'c1': 12, 'c2': 0}]
df1 = pd.DataFrame(inp)
print(df1)

My fake function:

def fake_funk(row, upper, lower):
    if lower < row['c1'] < upper:
        return(1)
    elif row['c2'] > upper:
        return(2)
    else:
        return(0)

Testing that it does in fact work:

for index, row in df1.iterrows():
    print(fake_funk(row,11,1))
1
2
0

Now using apply():

df1.apply(lambda row,: fake_funk(row,11,1))

The error I am getting is pretty long:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5126)()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item (pandas/_libs/hashtable.c:14010)()

TypeError: an integer is required

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-116-a554e891e761> in <module>()
----> 1 df1.apply(lambda row,: fake_funk(row,11,1))

/usr/local/anaconda3/lib/python3.5/site-packages/pandas/core/frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
   4260                         f, axis,
   4261                         reduce=reduce,
-> 4262                         ignore_failures=ignore_failures)
   4263             else:
   4264                 return self._apply_broadcast(f, axis)

/usr/local/anaconda3/lib/python3.5/site-packages/pandas/core/frame.py in _apply_standard(self, func, axis, ignore_failures, reduce)
   4356             try:
   4357                 for i, v in enumerate(series_gen):
-> 4358                     results[i] = func(v)
   4359                     keys.append(v.name)
   4360             except Exception as e:

<ipython-input-116-a554e891e761> in <lambda>(row)
----> 1 df1.apply(lambda row,: fake_funk(row,11,1))

<ipython-input-115-e95f3470fb25> in fake_funk(row, upper, lower)
      1 def fake_funk(row, upper, lower):
----> 2     if lower < row['c1'] < upper:
      3         return(1)
      4     elif row['c2'] > upper:
      5         return(2)

/usr/local/anaconda3/lib/python3.5/site-packages/pandas/core/series.py in __getitem__(self, key)
    599         key = com._apply_if_callable(key, self)
    600         try:
--> 601             result = self.index.get_value(self, key)
    602 
    603             if not is_scalar(result):

/usr/local/anaconda3/lib/python3.5/site-packages/pandas/core/indexes/base.py in get_value(self, series, key)
   2475         try:
   2476             return self._engine.get_value(s, k,
-> 2477                                           tz=getattr(series.dtype, 'tz', None))
   2478         except KeyError as e1:
   2479             if len(self) > 0 and self.inferred_type in ['integer', 'boolean']:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value (pandas/_libs/index.c:4404)()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value (pandas/_libs/index.c:4087)()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5210)()

KeyError: ('c1', 'occurred at index c1')

By default, apply operates along axis 0, which means it passes each column to the function as a Series. A column Series is indexed by the row labels (0, 1, 2), so row['c1'] fails with the KeyError shown above. You need to operate along axis 1, so that each row is passed instead. By the way, you don't need a lambda either: passing the extra arguments through the args parameter is enough.

df1.apply(fake_funk, axis=1, args=(11, 1))

0    1
1    2
2    0
dtype: int64
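
To make the axis difference concrete, here is a minimal sketch that rebuilds the question's data and function. It shows what kind of Series each axis hands to the function, and compares three ways of supplying the extra arguments. The variable names (first_column, by_args, and so on) are only illustrative, and the keyword form assumes the names upper and lower do not clash with any of apply's own parameters.

```python
import pandas as pd

df1 = pd.DataFrame([{'c1': 10, 'c2': 1}, {'c1': 11, 'c2': 110}, {'c1': 12, 'c2': 0}])

def fake_funk(row, upper, lower):
    if lower < row['c1'] < upper:
        return 1
    elif row['c2'] > upper:
        return 2
    else:
        return 0

# With axis=0 the function receives a column, indexed by row labels.
first_column = df1['c1']
print(list(first_column.index))   # [0, 1, 2]

# With axis=1 the function receives a row, indexed by column names.
first_row = df1.iloc[0]
print(list(first_row.index))      # ['c1', 'c2']

# The default axis therefore fails on row['c1'] with a KeyError.
raised_key_error = False
try:
    df1.apply(fake_funk, args=(11, 1))
except KeyError:
    raised_key_error = True
print(raised_key_error)           # True

# Three equivalent ways to pass upper=11 and lower=1 row by row.
by_args = df1.apply(fake_funk, axis=1, args=(11, 1))
by_kwargs = df1.apply(fake_funk, axis=1, upper=11, lower=1)
by_lambda = df1.apply(lambda row: fake_funk(row, 11, 1), axis=1)

print(by_args.tolist())           # [1, 2, 0]
```

The keyword form can be easier to read when a function takes several extra arguments, since it does not depend on their order.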
