
How to read a JSON file that has multiple JSON objects separated by newlines?

I want to read a JSON file in which each line contains a new JSON object.

The file looks like this:

{'P':'a1','D':'b1','T':'c1'}
{'P':'a2','D':'b2','T':'c2'}
{'P':'a3','D':'b3','T':'c3'}
{'P':'a4','D':'b4','T':'c4'}

I'm trying to read the file like this:

print(pd.read_json("sample.json", lines=True))

I get the following exception:

ValueError: Expected object or value

The actual sample.json file is about 240 MB, and it has only this format: each line contains one new JSON object. I want to read this file using Python pandas.

As others have said in the comments, it's not really JSON: JSON requires double-quoted strings, and this file uses single quotes. You can use ast.literal_eval() instead:

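import pandas as pd
import ast

with open('sample.json') as f:
    content = f.readlines()

# Each line is a Python dict literal; skip blank lines, which literal_eval rejects
df = pd.DataFrame([ast.literal_eval(line) for line in content if line.strip()])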
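
To see why this works, here is a minimal check on a single line copied from the sample file. It is only an illustration; the names `line`, `parsed_as_json` and `record` are made up for this sketch.

```python
import ast
import json

# One line copied from the sample file in the question.
line = "{'P':'a1','D':'b1','T':'c1'}"

# json.loads rejects the line because JSON strings must be double-quoted.
try:
    json.loads(line)
    parsed_as_json = True
except json.JSONDecodeError:
    parsed_as_json = False

# ast.literal_eval accepts it, since it is a valid Python dict literal.
record = ast.literal_eval(line)

print(parsed_as_json)
print(record)
```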

Or replace the single quotes with double quotes (this assumes no value contains a quote character itself):

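import pandas as pd
import json

with open('sample.json') as f:
    content = f.readlines()

# Turn each line into valid JSON by swapping the quote character; skip blank lines
df = pd.DataFrame([json.loads(line.replace("'", '"')) for line in content if line.strip()])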
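
Since the file is around 240 MB, another option is to convert it once into valid line-delimited JSON, one line at a time, instead of holding every line in memory. The sketch below assumes every non-blank line is a dict literal like the ones in the question; `convert_to_json_lines` and the `sample_fixed.json` name are hypothetical, not part of pandas or the standard library.

```python
import ast
import json
import os
import tempfile


def convert_to_json_lines(src_path, dst_path):
    """Rewrite a file of Python-style dict literals as valid JSON Lines.

    Hypothetical helper for this sketch. Returns the number of records written.
    """
    count = 0
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:  # iterate lazily, one line at a time
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = ast.literal_eval(line)       # parse the single-quoted literal
            dst.write(json.dumps(record) + "\n")  # write it back as double-quoted JSON
            count += 1
    return count


# Small demonstration on a temporary copy of the sample data.
tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, "sample.json")
dst_path = os.path.join(tmp_dir, "sample_fixed.json")
with open(src_path, "w") as f:
    f.write("{'P':'a1','D':'b1','T':'c1'}\n{'P':'a2','D':'b2','T':'c2'}\n")

written = convert_to_json_lines(src_path, dst_path)
with open(dst_path) as f:
    converted = [json.loads(line) for line in f]
```

Once the file is converted, the call from the question, pd.read_json(dst_path, lines=True), should be able to parse it, and its chunksize parameter can be used to read a large file in pieces.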
