
Read JSON into multiple Spark DataFrames using Scala

My JSON structure is something like this:

{
  "posts": [],
  "persons": [],
  "organizations": [],
  "meta": {
    "sources": [
      "http://loksabha.nic.in/",
      "http://wikidata.org/",
      "http://gender-balance.org/"
    ]
  },
  "memberships": [],
  "events": [],
  "areas": []
}

I want to read "posts" into a DataFrame, where "posts" is an array of JSON objects. The same goes for the other JSON arrays. The exception is "meta": the "sources" array inside the "meta" JSON object should be read into another DataFrame.

Is there any way to achieve this with Spark and Scala?

Any help is greatly appreciated.

Thanks in advance, Shakti

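You could use the explode function from org.apache.spark.sql.functions. I guess you have something like:

import org.apache.spark.sql.functions.{col, explode}
// multiLine is needed when the JSON document spans several lines, as in the question
val jsonDf = spark.read.option("multiLine", "true").json("your_json.json")
val postsDf = jsonDf.withColumn("post", explode(col("posts"))).select("post")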
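
The same idea can be extended to the other arrays and to the nested "sources" array. The following is only a minimal sketch, assuming the file is a single multi-line JSON document shaped like the one in the question. The names JsonSplitter, arrayFields, explodeArray and split are hypothetical helpers introduced for illustration; they are not part of Spark.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode}

// Hypothetical helper object, not part of Spark.
object JsonSplitter {
  // Top-level array fields taken from the JSON shown in the question.
  val arrayFields: Seq[String] =
    Seq("posts", "persons", "organizations", "memberships", "events", "areas")

  // Turn one array column (possibly nested, e.g. "meta.sources")
  // into a DataFrame with one row per array element.
  def explodeArray(df: DataFrame, path: String, alias: String): DataFrame =
    df.select(explode(col(path)).as(alias))

  // Read the file once and return one DataFrame per array, keyed by name.
  def split(spark: SparkSession, path: String): Map[String, DataFrame] = {
    // multiLine: the whole file is one JSON document, not one object per line.
    val jsonDf = spark.read.option("multiLine", "true").json(path)
    val topLevel = arrayFields.map(name => name -> explodeArray(jsonDf, name, "item")).toMap
    // "sources" lives inside the "meta" object, so it is addressed with a dotted path.
    topLevel + ("sources" -> explodeArray(jsonDf, "meta.sources", "source"))
  }
}
```

With a SparkSession named spark, JsonSplitter.split(spark, "your_json.json")("sources").show(false) would print one row per source URL. If the array elements are JSON objects, selecting "item.*" on the resulting DataFrame should flatten their fields into separate columns.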
