
Type Error: Pandas Dataframe apply function, argument passing

By default, the columns are all set to zero. I want to set the entry at (row, column) to 1 whenever the column name string is present in that row's URL column.

L # list that contains column names used to check if found on URL

Dataframe Image

def generate(statement, col):
    if statement.find(col) == -1:
        return 0
    else:
        return 1

for col in L:
  df3[col].apply(generate, args=(col))

I am a beginner, and it throws an error:

/usr/local/lib/python3.6/dist-packages/pandas/core/series.py in f(x)
   4195
   4196         def f(x):
-> 4197             return func(x, *args, **kwds)
   4198
   4199     else:

TypeError: generate() takes 2 positional arguments but 9 were given

Any suggestions would be helpful

Edit 1:

After changing the call to

df3[col].apply(generate, args=(col,))

I got this error:

> ---------------------------------------------------------------------------
> AttributeError                            Traceback (most recent call last)
> <ipython-input-162-508036a6e51f> in <module>()
>       1 for col in L:
> ----> 2   df3[col].apply(generate, args=(col,))
> 
> 2 frames
> pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()
> 
> <ipython-input-159-9380ffd36403> in generate(statement, col)
>       1 def generate(statement,col):
> ----> 2     if statement.find(col) == -1:
>       3         return 0
>       4     else:
>       5         return 1
> 
> AttributeError: 'int' object has no attribute 'find'

Edit 2: I forgot to apply the function to the URL column in the for loop; I will fix that.

Edit 3: Updated and fixed:

def generate(statement,col):
    if col in str(statement):
        return 1
    else:
        return 0

for col in L:
  df3[col] = df3['url'].apply(generate, col=col)

Thanks for all the support!
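
For reference, the AttributeError in Edit 1 comes from applying generate to df3[col], whose cells hold the integer 0, rather than to the URL column. The following is a minimal sketch of the Edit 3 fix. The contents of df3 and L are invented for illustration, since the original data is only shown as an image.

```python
import pandas as pd

# Hypothetical data: the real df3 and L are not shown in the question.
df3 = pd.DataFrame(
    {"url": ["example.com/news/sports", "example.com/blog", "example.com/news"]}
)
L = ["news", "blog", "sports"]

def generate(statement, col):
    # str() guards against non-string cells such as integers or NaN
    return 1 if col in str(statement) else 0

for col in L:
    # Read from the URL column and write the result into the indicator column
    df3[col] = df3["url"].apply(generate, col=col)

print(df3)
```

Assigning the result back to df3[col] matters: apply returns a new Series and does not modify the DataFrame in place.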

When creating a one-element tuple, a comma is required after the element: args=(col,). Otherwise the parentheses are ignored.
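
A quick way to check this in plain Python, using "Name" as an example value for col:

```python
col = "Name"

not_a_tuple = (col)   # parentheses only group the expression; still a str
one_tuple = (col,)    # the trailing comma is what creates the tuple

print(type(not_a_tuple).__name__)  # str
print(type(one_tuple).__name__)    # tuple
print(len(one_tuple))              # 1
```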

The problem is in how the parameter is passed through args. apply expects args to be a tuple and unpacks its elements as extra positional arguments to the function.

Let's look at an example:

import pandas as pd

df = pd.DataFrame([['xyz', 'US'], ['abc', 'MX'], ['xyz', 'CA']], columns=["Name", "Country"])

print(df)

  Name Country
0  xyz      US
1  abc      MX
2  xyz      CA

Create a function that takes an extra argument:

def generate(statement,col):
    if statement.find(col) == -1:
        return 0
    else:
        return 1

Let L be the list ['Name', 'Country'].

Now apply generate with the extra argument in a loop:

for col in L:
    print(df[col].apply(generate, args=(col)))


TypeError: generate() takes 2 positional arguments but 5 were given

The error occurs because (col) is not a tuple: the parentheses only group the expression, so args receives the string 'Name'. apply then unpacks that string character by character, as if args=('N', 'a', 'm', 'e') had been passed. Together with statement, the function receives five arguments instead of two.
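
The unpacking can be reproduced without pandas. In this sketch, call_like_apply is a hypothetical stand-in for the func(x, *args, **kwds) call that appears in the traceback; it is not part of pandas.

```python
def generate(statement, col):
    return 0 if statement.find(col) == -1 else 1

def call_like_apply(func, value, args=()):
    # Hypothetical helper mimicking `func(x, *args)` from the traceback
    return func(value, *args)

message = ""
try:
    # ("Name") is just the string "Name", so * unpacks it into four characters
    call_like_apply(generate, "xyz", args=("Name"))
except TypeError as exc:
    message = str(exc)
    print(message)

# ("Name",) is a one-element tuple, so generate receives exactly two arguments
result = call_like_apply(generate, "xyz", args=("Name",))
print(result)  # 0, because "Name" does not occur in "xyz"
```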

To avoid this, use either of the options below:

  1. Pass col directly as a keyword argument:
df[col].apply(generate, col=col)
  2. Pass the arguments as a tuple. For a single-element tuple, add a trailing comma:
df[col].apply(generate, args=(col,))
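
Both options can be checked against the example DataFrame. This sketch searches for the substring "xyz" instead of the column name, an assumption made only so that the result is not all zeros.

```python
import pandas as pd

df = pd.DataFrame(
    [["xyz", "US"], ["abc", "MX"], ["xyz", "CA"]], columns=["Name", "Country"]
)

def generate(statement, col):
    return 0 if statement.find(col) == -1 else 1

# Option 1: pass the extra argument by keyword
by_keyword = df["Name"].apply(generate, col="xyz")

# Option 2: pass the extra argument as a one-element tuple
by_tuple = df["Name"].apply(generate, args=("xyz",))

print(by_keyword.tolist())  # [1, 0, 1]
print(by_tuple.tolist())    # [1, 0, 1]
```

The keyword form is harder to get wrong, since it does not depend on the trailing comma.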
