How to Read a Huge, Valid JSON File Line by Line in Python
I've been trying to use this code to read a huge JSON file (it contains more than 80 million records) line by line:
import json
import pandas as pd
lines = []
with open('file_path','r') as f:
    for line in f:
        lines.append(json.loads(line))
df = pd.DataFrame(lines)
But this gives an error:
JSONDecodeError: Expecting property name enclosed in double quotes
Then I tried the replace function, using the code below:
import json
import pandas as pd
lines = []
jstr = ""
with open('filepath','r') as f:
    for line in f:
        jstr = f'{jstr}{line}'
        jstr = line.replace("'", '"')
        lines.append(json.loads(jstr))
df = pd.DataFrame(lines)
But I can only read the first six rows, and then I get this error:
JSONDecodeError: Expecting ',' delimiter
I have been assured that the JSON is in a valid format, but I don't know what to do.
Could anyone help me handle this problem?
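One possible explanation, offered as a guess since the file itself is not shown: the first error usually means the keys are single-quoted, so each line may be a Python dict literal rather than strict JSON. Blindly replacing every ' with " then breaks any value that contains an apostrophe, which would produce the second error. The sketch below uses a made-up sample line and a hypothetical helper, parse_record, to show both effects; it falls back to ast.literal_eval only when json.loads fails.

```python
import ast
import json


def parse_record(line):
    """Parse one line: strict JSON first, Python literal as a fallback.

    parse_record is a hypothetical helper, not part of any library.
    """
    try:
        return json.loads(line)
    except json.JSONDecodeError:
        # Assumption: the line is a Python dict literal such as {'a': 1}.
        return ast.literal_eval(line.strip())


# Made-up sample line: single-quoted keys and an apostrophe inside a value.
sample = "{'name': \"O'Brien\", 'id': 7}"

# The naive replace also rewrites the apostrophe inside the value,
# which leaves a stray double quote in the middle of the string.
naive = sample.replace("'", '"')
try:
    json.loads(naive)
    naive_ok = True
except json.JSONDecodeError:
    naive_ok = False

record = parse_record(sample)
```

If the real lines are not Python literals either, ast.literal_eval will raise ValueError or SyntaxError, and the offending line is worth printing and inspecting directly.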
Maybe this is what you are looking for?
import pandas as pd
df = pd.read_json('data/simple.json')
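With 80+ million records, loading everything into one list or one DataFrame may not fit in memory. A minimal sketch of batching follows, assuming the file holds one JSON object per line; iter_batches and the batch size are illustrative names and values, not something from the original post. An in-memory io.StringIO stands in for the real file so the example runs on its own.

```python
import io
import json


def iter_batches(handle, batch_size):
    """Yield lists of parsed records, at most batch_size records per list.

    iter_batches is a hypothetical helper written for this example.
    """
    batch = []
    for line in handle:
        if not line.strip():
            continue  # skip blank lines instead of failing on them
        batch.append(json.loads(line))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter, batch


# Stand-in for the real file: three records and one blank line.
demo = io.StringIO('{"id": 1}\n{"id": 2}\n\n{"id": 3}\n')
sizes = [len(batch) for batch in iter_batches(demo, batch_size=2)]
```

With a real file, the same generator can be fed an open file handle, and each batch can be passed to pd.DataFrame and processed or written out before the next one is read. pandas' read_json also accepts lines=True together with chunksize for line-delimited files, which returns an iterator of DataFrames rather than one large frame.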