
Convert SQL data to a JSON array [java spark]

I have a DataFrame and want to convert it into a JSON array. Please find the example below.

DataFrame:

+------------+----------------+----------+----------------+------------------+
|        Name|              id|request_id|create_timestamp|deadline_timestamp|
+------------+----------------+----------+----------------+------------------+
|    Freeform|59bbe3ad-f487-44| htvjiwmfe|   1589155200000|     1591272659556|
|         D23|59bbe3ad-f487-44| htvjiwmfe|   1589155200000|     1591272659556|
|      Stores|59bbe3ad-f487-44| htvjiwmfe|   1589155200000|     1591272659556|
|VacationClub|59bbe3ad-f487-44| htvjiwmfe|   1589155200000|     1591272659556|
+------------+----------------+----------+----------------+------------------+

Desired JSON output:


[
   {
      "testname":"xyz",
      "systemResponse":[
         {
            "name":"FGH",
            "id":"59bbe3ad-f487-44",
            "request_id":1590791280,
            "create_timestamp":1590799280
         },
         {
            "name":"FGH",
            "id":"59bbe3ad-f487-44",
            "request_id":1590791280,
            "create_timestamp":1590799280
         }
      ]
   }
]

  • Define two beans (case classes): an inner one for each row and a parent one.
  • Collect the first DataFrame as an array of inner beans.
  • Define the parent bean with testname and the array of inner beans (requestDetailArray).

See also the inline comments in the code.

object DataToJsonArray {

  def main(args: Array[String]): Unit = {

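    // Constant.getSparkSess is the answerer's own helper (not shown here); it is
    // assumed to return a SparkSession, e.g. SparkSession.builder().getOrCreate()
    val spark = Constant.getSparkSess

    import spark.implicits._

    // Load your DataFrame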
    val requestDetailArray = List(
      ("Freeform", "59bbe3ad-f487-44", "htvjiwmfe", "1589155200000", "1591272659556"),
      ("D23", "59bbe3ad-f487-44", "htvjiwmfe", "1589155200000", "1591272659556"),
      ("Stores", "59bbe3ad-f487-44", "htvjiwmfe", "1589155200000", "1591272659556"),
      ("VacationClub", "59bbe3ad-f487-44", "htvjiwmfe", "1589155200000", "1591272659556")
    ).toDF
      //Map your Dataframe to RequestDetails bean
      .map(row => RequestDetails(row.getString(0), row.getString(1), row.getString(2), row.getString(3), row.getString(4)))
      //Collect it as Array
      .collect() 

    // Create another DataFrame from List[BaseClass], setting (testname, Array[RequestDetails])
    List(BaseClass("xyz", requestDetailArray)).toDF()
      .write
      // Write the DataFrame as JSON; Spark writes one JSON object per line
      // into part files under this directory
      .json("/json/output/path")
  }

}

case class RequestDetails(Name: String, id: String, request_id: String, create_timestamp: String, deadline_timestamp: String)

case class BaseClass(testname: String = "xyz", systemResponse: Array[RequestDetails])
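
Note that write.json emits one JSON object per line, without the enclosing [ ] shown in the desired output. If a single JSON array string is needed and the result is small enough to collect on the driver, one possible sketch follows. The names Detail, Parent and JsonArrayString, the reduced set of fields, and the local[*] master are assumptions made for illustration; they are not part of the answer above.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical beans for illustration, mirroring the case classes above
case class Detail(Name: String, id: String, request_id: String)
case class Parent(testname: String, systemResponse: Array[Detail])

object JsonArrayString {
  // Renders every row as a JSON object and joins the objects into one JSON array string.
  def toJsonArray(spark: SparkSession): String = {
    import spark.implicits._
    val details = Array(
      Detail("Freeform", "59bbe3ad-f487-44", "htvjiwmfe"),
      Detail("D23", "59bbe3ad-f487-44", "htvjiwmfe"))
    Seq(Parent("xyz", details)).toDS()
      .toJSON                   // Dataset[String], one JSON object per row
      .collect()                // assumption: the result fits in driver memory
      .mkString("[", ",", "]")  // wrap the objects in an outer JSON array
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is acceptable for trying this out
    val spark = SparkSession.builder().master("local[*]").appName("json-array").getOrCreate()
    println(toJsonArray(spark))
    spark.stop()
  }
}
```

For large results, collecting on the driver is not advisable; writing with write.json and post-processing the part files is the safer route.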

Check the code below.

import org.apache.spark.sql.functions._

df.withColumn("systemResponse",
     array(
           struct("id","request_id","create_timestamp","deadline_timestamp").as("data")
         )
)
.select("systemResponse")
.toJSON
.select(col("value").as("json_data"))
.show(false)

+-----------------------------------------------------------------------------------------------------------------------------------------------+
|json_data                                                                                                                                      |
+-----------------------------------------------------------------------------------------------------------------------------------------------+
|{"systemResponse":[{"id":"59bbe3ad-f487-44","request_id":"htvjiwmfe","create_timestamp":"1589155200000","deadline_timestamp":"1591272659556"}]}|
|{"systemResponse":[{"id":"59bbe3ad-f487-44","request_id":"htvjiwmfe","create_timestamp":"1589155200000","deadline_timestamp":"1591272659556"}]}|
|{"systemResponse":[{"id":"59bbe3ad-f487-44","request_id":"htvjiwmfe","create_timestamp":"1589155200000","deadline_timestamp":"1591272659556"}]}|
|{"systemResponse":[{"id":"59bbe3ad-f487-44","request_id":"htvjiwmfe","create_timestamp":"1589155200000","deadline_timestamp":"1591272659556"}]}|
+-----------------------------------------------------------------------------------------------------------------------------------------------+

Updated:

scala> :paste
// Entering paste mode (ctrl-D to finish)

df.withColumn("systemResponse",
     array(
           struct("id","request_id","create_timestamp","deadline_timestamp").as("data")
         )
)
.withColumn("testname",lit("xyz"))
.select("testname","systemResponse")
.toJSON
.select(col("value").as("json_data"))
.show(false)

// Exiting paste mode, now interpreting.

+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
|json_data                                                                                                                                                       |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
|{"testname":"xyz","systemResponse":[{"id":"59bbe3ad-f487-44","request_id":"htvjiwmfe","create_timestamp":"1589155200000","deadline_timestamp":"1591272659556"}]}|
|{"testname":"xyz","systemResponse":[{"id":"59bbe3ad-f487-44","request_id":"htvjiwmfe","create_timestamp":"1589155200000","deadline_timestamp":"1591272659556"}]}|
|{"testname":"xyz","systemResponse":[{"id":"59bbe3ad-f487-44","request_id":"htvjiwmfe","create_timestamp":"1589155200000","deadline_timestamp":"1591272659556"}]}|
|{"testname":"xyz","systemResponse":[{"id":"59bbe3ad-f487-44","request_id":"htvjiwmfe","create_timestamp":"1589155200000","deadline_timestamp":"1591272659556"}]}|
+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
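
The output above still has one row per input row, each with a one-element systemResponse array. If all rows should end up in a single array, as in the desired output, a different technique is to aggregate with collect_list before serializing. The sketch below is not the answerer's code; the object name AggregateToJson, the sample rows and the local[*] master are assumptions for illustration. The order of elements produced by collect_list is not guaranteed.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{collect_list, lit, struct, to_json}

object AggregateToJson {
  // Collapses all rows into a single row whose systemResponse array
  // holds one struct per input row, then serializes that row to JSON.
  def aggregate(df: DataFrame): DataFrame =
    df.agg(
        collect_list(
          struct("Name", "id", "request_id", "create_timestamp", "deadline_timestamp")
        ).as("systemResponse"))
      .withColumn("testname", lit("xyz"))
      .select(to_json(struct("testname", "systemResponse")).as("json_data"))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is acceptable for trying this out
    val spark = SparkSession.builder().master("local[*]").appName("agg-json").getOrCreate()
    import spark.implicits._
    val df = Seq(
      ("Freeform", "59bbe3ad-f487-44", "htvjiwmfe", "1589155200000", "1591272659556"),
      ("D23", "59bbe3ad-f487-44", "htvjiwmfe", "1589155200000", "1591272659556")
    ).toDF("Name", "id", "request_id", "create_timestamp", "deadline_timestamp")
    aggregate(df).show(false)
    spark.stop()
  }
}
```

Because the aggregation has no grouping key, the whole DataFrame is gathered into one row, so this is only suitable when the data fits comfortably in a single task.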
