How to Read Huge and Valid JSON File Line by Line in Python

I've been trying to use this code to read a huge JSON file (it contains 80+ million records) line by line:

import json
import pandas as pd


lines = []

with open('file_path', 'r') as f:
    for line in f:
        lines.append(json.loads(line))
        df = pd.DataFrame(lines)

But this gives an error:

JSONDecodeError: Expecting property name enclosed in double quotes

Then I tried the replace function, using the code below:

import json
import pandas as pd


lines = []
jstr = ""


with open('filepath', 'r') as f:
    for line in f:
        jstr = f'{jstr}{line}'
        jstr = line.replace("'", '"')
        lines.append(json.loads(jstr))
        df = pd.DataFrame(lines)

But I can only read the first six rows, and then I get this error:

JSONDecodeError: Expecting ',' delimiter

The JSON is confirmed to be in a valid format, but I don't know what to do.

Could anyone help me handle this problem?
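One possible reading of the two errors, offered as a guess since the file itself is not shown: the lines may be Python-style dicts written with single quotes, and some values may contain an apostrophe. In that case a blanket replace of ' with " would break the string it sits in and could produce the "Expecting ',' delimiter" error. A minimal sketch under that assumption follows; the sample line and the parse_record helper are hypothetical, not taken from the real data.

```python
import ast
import json

# Hypothetical sample line: a Python-style dict using single quotes, with an
# apostrophe inside one value. This is an assumption about the file's contents.
line = "{'name': \"O'Brien\", 'age': 30}"


def parse_record(text):
    """Parse one line as JSON, falling back to a Python literal."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # literal_eval only evaluates literals (dicts, lists, strings,
        # numbers), so it does not run arbitrary code from the file.
        return ast.literal_eval(text)


# Blanket quote replacement corrupts the apostrophe inside the value,
# turning "O'Brien" into "O"Brien", which is no longer valid JSON.
broken = line.replace("'", '"')
try:
    json.loads(broken)
    replace_worked = True
except json.JSONDecodeError:
    replace_worked = False

record = parse_record(line)
```

Note that ast.literal_eval does not understand the JSON keywords true, false, and null, so this fallback only helps if the lines really are Python literals.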

Maybe you are looking for this?

import pandas as pd

df = pd.read_json('data/simple.json')
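With 80+ million records, loading everything in one call may not fit in memory. If the file holds one JSON object per line (an assumption, the question does not confirm it), pd.read_json can read it in pieces with lines=True and chunksize. Here is a rough sketch; the temporary file and its three sample records are made up for illustration only.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data: three records, one JSON object per line.
sample = (
    '{"id": 1, "name": "a"}\n'
    '{"id": 2, "name": "b"}\n'
    '{"id": 3, "name": "c"}\n'
)

with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

total_rows = 0
# lines=True treats each line as one JSON document; chunksize makes read_json
# return an iterator of DataFrames instead of loading the whole file at once.
reader = pd.read_json(path, lines=True, chunksize=2)
for chunk in reader:
    total_rows += len(chunk)  # replace with real per-chunk processing

os.remove(path)
```

For the real file you would pick a much larger chunksize and process or store each chunk before moving to the next one, rather than appending every row to a single list.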
