
Spark - Scala: Reading a JSON file as a DataFrame doesn't work when the JSON data is spread across multiple lines?

JSON data:

{ "blogID" : "FJY26J1333", "date" : "2012-04-01",
"name" : "vpxnksu", "comment" : "good stuff"} 
{"blogID" : "VSAUMDFGSD", "date" : "2012-04-12", "name" : "yhftrcx", "comment" : "another comment"}

Code:

val dataFrame=sqlContext.read.json("sample.json")
dataFrame.show()

Output:

_corrupt_record       blogID      comment          date        name
{ "blogID" : "FJY...  null        null             null        null
"name" : "vpxnksu...  null        null             null        null
 null                 VSAUMDFGSD  another comment  2012-04-12  yhftrcx

How can I read these as two records?

Make sure there is one JSON object per line in the source file, like this:

{ "blogID" : "FJY26J1333", "date" : "2012-04-01", "name" : "vpxnksu", "comment" : "good stuff"} 
{ "blogID" : "VSAUMDFGSD", "date" : "2012-04-12", "name" : "yhftrcx", "comment" : "another comment"}  
