
Convert Scala RDD Map Function to Pyspark

I'm trying to convert the following function from Scala to PySpark:

DF.rdd.map(args => (args(0).toString, args.mkString("|"))).take(5)

For that, I wrote the following map function:

DF.rdd.map(lambda line: ",".join([str(x) for x in line])).take(5)

But the Scala code gives me an Array of pairs, while in Python I am getting a single delimited string per row.

How do I convert the above Scala code to Python?

Your Scala code returns a 2-element tuple for each row: the first element as a string, plus all elements joined with `|`.

Your Python code returns a single comma-joined string.

This would return the same thing:

DF.rdd.map(lambda args: (str(args[0]), "|".join(map(str, args)))).take(5)
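As a quick sanity check without a Spark cluster, the same lambda can be applied to plain Python tuples standing in for the rows of `DF.rdd` (the sample data below is hypothetical):

```python
# Hypothetical sample rows, standing in for the rows of DF.rdd.
rows = [
    (1, "alice", 30),
    (2, "bob", 25),
]

# The transformation applied via rdd.map: a 2-element pair of
# (first column as a string, all columns joined with "|"),
# mirroring Scala's (args(0).toString, args.mkString("|")).
to_pair = lambda args: (str(args[0]), "|".join(map(str, args)))

result = [to_pair(r) for r in rows]
print(result)
# [('1', '1|alice|30'), ('2', '2|bob|25')]
```

Running the lambda locally like this confirms the output shape matches the Scala version: a collection of `(key, pipe-joined-string)` pairs rather than one flat delimited string.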

