I have the following file: test.json >
{
"id": 1,
"name": "A green door",
"price": 12.50,
"tags": ["home", "green"]
}
I want to load this file into a RDD. This is what I tried:
rddj = sc.textFile('test.json')
rdd_res = rddj.map(lambda x: json.loads(x))
I got an error:
Expecting object: line 1 column 1 (char 0)
I don't completely understand what does json.loads
do.
How can I resolve this problem ?
textFile
reads data line by line. Individual lines of your input are not syntactically valid JSON.
Just use json reader:
spark.read.json("test.json", multiLine=True)
or (not recommended) whole text files
sc.wholeTextFiles("test.json").values().map(json.loads)
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.