
Convert Spark DataFrame to JSON using Scala

I have a DataFrame in the following format:

ID  currency   account name    principal   interest
123    USD     Principal       1000        100
123    EUR     Principal       2000        50
123    USD     Interest        2000        100

I would like the json output in the following format:

{ 
       "id":"123",
       "principal_type":{ 
          "USD":1000,
          "EUR":2000
       },
       "interest_type":{ 
          "USD":100
       }
    }

The first two rows have the account name Principal, so they go under principal_type; the third row is Interest, so it goes under interest_type. In each object the key is the currency, and the value is the principal or interest amount, depending on the type.

You can try it this way in Spark:

scala> val dfdd = Seq((123,"USD","Principal",1000,100),(123,"EUR","Principal",2000,50),(123,"USD","Interest",2000,100)).toDF("ID","currency","account_name","principal","interest")

scala> dfdd.show()
+---+--------+------------+---------+--------+
| ID|currency|account_name|principal|interest|
+---+--------+------------+---------+--------+
|123|     USD|   Principal|     1000|     100|
|123|     EUR|   Principal|     2000|      50|
|123|     USD|    Interest|     2000|     100|
+---+--------+------------+---------+--------+
scala> val dfdd2 = dfdd.groupBy("ID","account_name").pivot("currency").agg(collect_list("principal"))

scala> dfdd2.show() // .show() added only to illustrate the intermediate result
+---+------------+------+------+
| ID|account_name|   EUR|   USD|
+---+------------+------+------+
|123|    Interest|    []|[2000]|
|123|   Principal|[2000]|[1000]|
+---+------------+------+------+

scala> val dfdd3 = dfdd2.withColumn("account_type",struct($"account_name",$"EUR",$"USD")).drop("EUR","USD","account_name").groupBy("id").agg(collect_list("account_type").as("test"))


scala> dfdd3.toJSON.show(false)
+----------------------------------------------------------------------------------------------------------------------------+
|value                                                                                                                       |
+----------------------------------------------------------------------------------------------------------------------------+
|{"id":123,"test":[{"account_name":"Interest","EUR":[],"USD":[2000]},{"account_name":"Principal","EUR":[2000],"USD":[1000]}]}|
+----------------------------------------------------------------------------------------------------------------------------+

This has a similar JSON structure to your desired output.
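
If you need the exact key layout from the question (principal_type and interest_type as currency-to-amount objects), a variation along these lines may work. It is only a sketch under some assumptions: Spark 2.4 or later (for map_from_entries), at most one row per (ID, account_name, currency), and the names `amountsFor` and `result` are made up for illustration. It reads the amount from the principal column for Principal rows and from the interest column for Interest rows.

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.{col, collect_list, map_from_entries, struct, when}

// In spark-shell, getOrCreate() simply returns the session that already exists.
val spark = SparkSession.builder().master("local[*]").appName("json-sketch").getOrCreate()
import spark.implicits._

val dfdd = Seq(
  (123, "USD", "Principal", 1000, 100),
  (123, "EUR", "Principal", 2000, 50),
  (123, "USD", "Interest", 2000, 100)
).toDF("ID", "currency", "account_name", "principal", "interest")

// Hypothetical helper: builds a currency -> amount map from the rows whose
// account_name matches. Rows that do not match yield a null struct, and
// collect_list skips nulls, so only matching rows end up in the map.
def amountsFor(accountName: String, amountCol: String): Column =
  map_from_entries(
    collect_list(when($"account_name" === accountName, struct($"currency", col(amountCol))))
  )

val result = dfdd
  .groupBy($"ID".cast("string").as("id")) // the question shows id as a JSON string
  .agg(
    amountsFor("Principal", "principal").as("principal_type"),
    amountsFor("Interest", "interest").as("interest_type")
  )

result.toJSON.show(false)
```

The order of the keys inside each object follows the order in which collect_list sees the rows, so it is not guaranteed. If the same currency can appear more than once for one account type, you would need to aggregate the amounts first (for example with sum), because duplicate map keys raise an error under Spark 3's default settings.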

Have a look, and let me know if you have any questions about it.
