Create one column from multiple columns of a Spark DataFrame - Scala equivalent of Python

I am using the Python code below to take the key/value pairs from multiple columns of a DataFrame (interalexternalid, which has two columns, InternalId and ExternalId, returned from Spark SQL) and merge them into a single column named "body".

from pyspark.sql.functions import to_json, struct
jsonDf = interalexternalid.select(to_json(struct([interalexternalid[x] for x in interalexternalid.columns])).alias("body"))
display(jsonDf)

The result looks like this:

"body"
{"InternalId":480941,"ExternalId":"a020H00001Tt7NrQAJ"}
{"InternalId":480942,"ExternalId":"a020H00001Tt7NsQAJ"}

How can I achieve the same in Scala? Sorry, I am not from the Scala world, but I am trying to catch up on this specific area and would like to know whether it can be done the same way as in Python.

Please don't mark this as "not tried": Scala is a different world from Python, and the same code doesn't work there.

If I understand you correctly, you can do what you need like this:

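// toDF on a Seq needs the session implicits (already imported in spark-shell and most notebooks)
import spark.implicits._
val df = Seq(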
  (480941, "a020H00001Tt7NrQAJ"),
  (480942, "a020H00001Tt7NsQAJ")
).toDF("InternalId", "ExternalId")
df.show()
+----------+------------------+
|InternalId|        ExternalId|
+----------+------------------+
|    480941|a020H00001Tt7NrQAJ|
|    480942|a020H00001Tt7NsQAJ|
+----------+------------------+

import org.apache.spark.sql.functions._
// map each column name to a Column, then expand the array into varargs with `: _*`
val jsonDf = df.select(to_json(struct(df.columns.map(col): _*)).alias("body"))
jsonDf.show(truncate = false)

+-------------------------------------------------------+
|body                                                   |
+-------------------------------------------------------+
|{"InternalId":480941,"ExternalId":"a020H00001Tt7NrQAJ"}|
|{"InternalId":480942,"ExternalId":"a020H00001Tt7NsQAJ"}|
+-------------------------------------------------------+
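For readers coming from Python, it may help to see the one-liner above split into steps. The following is only a sketch under some assumptions: it builds its own local SparkSession (in spark-shell or a notebook, `spark` normally exists already), and the names `allColumns` and `jsonDs` are made up for illustration. It also shows `toJSON` as a possible alternative, which serializes whole rows and returns a Dataset[String] whose single column is named "value".

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, struct, to_json}

// Assumption: a local session for illustration only; reuse your existing `spark` if you have one.
val spark = SparkSession.builder().master("local[*]").appName("to-json-sketch").getOrCreate()
import spark.implicits._

val df = Seq(
  (480941, "a020H00001Tt7NrQAJ"),
  (480942, "a020H00001Tt7NsQAJ")
).toDF("InternalId", "ExternalId")

// df.columns is an Array[String]; map(col) turns each name into a Column.
// This plays the role of the Python list comprehension over interalexternalid.columns.
val allColumns = df.columns.map(col)

// struct takes varargs (Column*), so `: _*` expands the array into separate arguments.
val jsonDf = df.select(to_json(struct(allColumns: _*)).alias("body"))

// Possible alternative: serialize each whole row, then rename the "value" column to "body".
val jsonDs = df.toJSON.toDF("body")

jsonDf.show(truncate = false)
```

The `: _*` part is the piece that most often trips up Python users: Scala does not accept an array where varargs are expected, so the expansion has to be spelled out.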
