I am working with PySpark 2.2. My code crashes in this function and I don't understand why:
```
import pyspark.sql.functions as F

t.withColumn('column_name',
    F.expr("aggregate(column, '', (acc, x) -> acc || concat(x, 4) ','))"))
```
The error is something like: "extraneous input '>' expecting 'all of sql functions'"

Thanks for your help.
Three suggestions:

1. Remove the trailing ',')) — those parentheses are unmatched.

2. Use either ... || ... or concat(..., ...); there is no need for both.

3. Because the column expression contains > and the expression is not aliased, Spark tries to use the expression itself as the column name, but column names cannot contain >, which is what the error message alludes to. Solution: alias the column.
(See the screenshot. Yes, the screenshot uses Spark SQL rather than PySpark, but the OP's issue is with the Spark SQL snippet inside the PySpark call.)
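Putting the suggestions together, a corrected version of the call might look like the sketch below. This is an assumption-laden sketch, not the OP's verified fix: it assumes a SparkSession and a DataFrame t with an array column named column, and note that the aggregate() higher-order function only exists in Spark SQL from version 2.4 onward, so it would not parse on 2.2 regardless. The runnable part shows, in plain Python, the fold the corrected expression performs.

```python
# Corrected PySpark call (sketch only; assumes Spark 2.4+, a SparkSession,
# and a DataFrame `t` with an array column `column`):
#
#   import pyspark.sql.functions as F
#   t = t.withColumn(
#       'column_name',
#       F.expr("aggregate(column, '', (acc, x) -> concat(acc, x, ','))"),
#   )
#
# Parentheses are balanced, and only concat() is used (no mixed `||`).
# The fold that expression performs is equivalent to this plain-Python reduce:
from functools import reduce

def aggregate_concat(values):
    # acc starts as '' and each element is appended, followed by a comma,
    # mirroring aggregate(column, '', (acc, x) -> concat(acc, x, ','))
    return reduce(lambda acc, x: acc + str(x) + ',', values, '')

print(aggregate_concat(['a', 'b', 'c']))  # -> a,b,c,
```

If the trailing comma is unwanted, Spark 2.4+ also offers array_join(column, ','), which joins array elements with a separator directly.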