Pyspark TypeError: 'NoneType' object is not callable when applying a UDF on dataframe column

I have created a DataFrame with the schema below. I'm trying to extract the first 10 values of "contents.monid" in each row, for which I created a UDF, 'udfTop'.

>>> df.printSchema()
 |-- userid: long (nullable = true)
 |-- contents: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- monid: struct (nullable = true)
 |    |    |    |-- mon: string (nullable = true)
 |    |    |    |-- id: long (nullable = true)
 |    |    |-- count: integer (nullable = true)

>>> def take(n,data):
...     if data is None:
...             return None
...     else:
...             return data.take(n)

>>> udfTop = spark.udf.register("top_n", take)

But when I apply udfTop to the "monid" field of the "contents" column, which is of struct type, I get the error TypeError: 'NoneType' object is not callable. I handle null values in the UDF definition, and there are actually no null values in that column.

>>> new_df = df.withColumn("mon_ids", udfTop(10, "contents.monid"))
 Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 TypeError: 'NoneType' object is not callable

I was able to follow a similar approach in spark-shell using Scala without errors, but I want this to work in PySpark.

For a row in df with the 'contents' column value as:

[[Art,1111],100],[[Art,1112],110],[[Art,1113],120],[[Art,1114],130].....(100 such values)

After applying the UDF, that row should have the following value in the 'mon_ids' column of new_df:

[Art,1111],[Art,1112],[Art,1113],[Art,1114]....(10 values)

The issue turned out to be my spark.udf.register syntax. Creating the UDF with the syntax below, and changing data.take(n) to data[:n], resolved it:

udfTop=udf(take,ArrayType(IntegerType()))
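For reference, here is a minimal sketch of how the fixed version might look end to end. The error is consistent with spark.udf.register returning None in older PySpark releases (before 2.3, as far as I know), which would leave udfTop itself as None. A few things here are my assumptions rather than part of the original fix: the helper name build_top_n_udf is hypothetical, n is captured in a closure instead of being passed as a column argument, and the return type is an array of structs matching the printed schema rather than ArrayType(IntegerType()).

```python
def take(data, n=10):
    """Return the first n elements of data, or None when data is None."""
    if data is None:  # Python spells null as None
        return None
    return data[:n]  # a Spark array reaches a Python UDF as a list


def build_top_n_udf(n=10):
    """Hypothetical helper: wrap take() as a column UDF.

    pyspark is imported here so that take() stays usable without Spark.
    """
    from pyspark.sql.functions import udf
    from pyspark.sql.types import (ArrayType, LongType, StringType,
                                   StructField, StructType)

    # Assumed return type, mirroring the monid struct in the schema above.
    monid_type = StructType([
        StructField("mon", StringType(), True),
        StructField("id", LongType(), True),
    ])
    # n is fixed in the closure, so only the column is passed to the UDF.
    return udf(lambda data: take(data, n), ArrayType(monid_type))
```

With a DataFrame shaped like the one in the question, this could be applied as new_df = df.withColumn("mon_ids", build_top_n_udf(10)(df["contents.monid"])). Keeping take() as a plain function also makes it easy to check outside Spark.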
