
Spark SQL column with aggregate function must have alias

I am working with PySpark 2.2, and my code crashes on this function; I don't understand why it fails:

Code:

import pyspark.sql.functions as F
t.withColumn('column_name',
             F.expr("aggregate(column, '', (acc, x) -> acc || concat(x, 4) ','))"))

The error is: "extraneous input '>' expecting 'all of sql functions'"

Thanks for your help.

Three suggestions.

  1. Remove the ',')); those parentheses are unmatched.

  2. Use either … || … or concat(…, …); there is no need for both.

  3. Because the column expression contains > and the column is not aliased, Spark tries to use the expression itself as the column name, but column names cannot contain > , which is what the OP's error message alludes to.

    Solution: Alias the column.

    (See the screenshot. Yes, the screenshot uses Spark SQL instead of PySpark, but the OP's issue is with the Spark SQL snippet inside the PySpark code.)
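For intuition on what the fixed expression computes: Spark SQL's aggregate(array, start, merge) is a left fold over the array, so a corrected merge lambda such as (acc, x) -> concat(acc, x, ',') appends each element followed by a comma. Here is a minimal plain-Python sketch of that fold (pure Python so it runs without a Spark session; the helper name spark_aggregate is made up for illustration):

```python
from functools import reduce

def spark_aggregate(arr, start, merge):
    """Mimic Spark SQL's aggregate(array, start, (acc, x) -> ...) left fold."""
    return reduce(merge, arr, start)

# Corrected merge function: concat(acc, x, ',') with balanced parentheses,
# instead of the unbalanced acc || concat(x, 4) ','))
joined = spark_aggregate(["a", "b", "c"], "", lambda acc, x: acc + x + ",")
print(joined)  # a,b,c,
```

In PySpark itself, assuming the column holds an array of strings, the corresponding expression would look like F.expr("aggregate(column, '', (acc, x) -> concat(acc, x, ','))"), aliased as suggestion 3 describes when used outside withColumn.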

