
Loading and reading a JSON file is very slow after writing it in Python

I'm trying to read in a JSON file, determine how many words are in the "text" field, add that information as a new field, "length", and write the new JSON object to a file. I've done that with the following code:

import json

with open("file_read.json", "r") as review_file, open(
    "file_write.json", "w") as review_write:
    for line in review_file:
        review_object = json.loads(line)
        review_object["length"] = len(review_object["text"].split())
        json.dump(review_object, review_write)

The original file is over 200 MB, but I can view it without trouble in Vim. However, the file I write, which is only 3 MB larger, takes a very long time to load, if it loads at all. Furthermore, even if I read only the first JSON object, there are issues. I tried the following after writing the file:

with open("file_write.json", "r") as review_file:
    print(review_file.readline())
    print("abcd123")

I'm using Vim with python-mode, and when I scroll through the output of the first print statement (the JSON data) it is very choppy, but the output of the second statement is not.

The way you are writing your file, you will end up with only one HUGE line, because json.dump does not write a newline after each object.

# example
json.dump([1,2,3], fp)
json.dump({"name": "abc"}, fp)
json.dump(33, fp)
# content of file
# [1, 2, 3]{"name": "abc"}33

This may explain why it is so slow to read: anything that reads the file line by line has to load roughly 200 MB of text as a single line. Parsing the whole file as JSON will also fail, because several JSON documents written back to back do not form one valid JSON document.
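The effect can be reproduced on a small scale. The sketch below is only an illustration: it uses an in-memory `io.StringIO` buffer as a stand-in for the real output file (an assumption made so the example is self-contained), and the variable names are hypothetical.

```python
import io
import json

# In-memory stand-in for the output file (assumption for this demo).
fp = io.StringIO()
for obj in ([1, 2, 3], {"name": "abc"}, 33):
    json.dump(obj, fp)  # json.dump writes no separator between documents

content = fp.getvalue()

# All three documents end up on a single line.
line_count = len(content.splitlines())

# Parsing the concatenated text as one JSON document raises an error.
try:
    json.loads(content)
    parse_error = None
except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
    parse_error = exc

print(repr(content))
print(line_count)
print(type(parse_error).__name__)
```

With a 200 MB input, the same pattern yields one line of roughly 200 MB, which is what `readline()` and the editor then have to handle in one piece.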

To fix it, write each object followed by a newline instead:

fp.write(json.dumps(review_object) + "\n")
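Applied to the script from the question, the fix might look like the sketch below. The function name `add_lengths`, the explicit `encoding="utf-8"`, the blank-line check, and the two sample records are assumptions added for illustration; the demo writes to a temporary directory rather than the real files.

```python
import json
import os
import tempfile


def add_lengths(src_path, dst_path):
    """Copy a one-object-per-line JSON file, adding a "length" word count."""
    count = 0
    with open(src_path, "r", encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if not line.strip():
                continue  # skip blank lines, if any
            review = json.loads(line)
            review["length"] = len(review["text"].split())
            # One JSON object per line, so the output can be read line by line.
            dst.write(json.dumps(review) + "\n")
            count += 1
    return count


# Small demo with made-up sample data in a temporary directory.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "file_read.json")
dst_path = os.path.join(workdir, "file_write.json")
with open(src_path, "w", encoding="utf-8") as f:
    f.write('{"text": "good food"}\n{"text": "would not go back"}\n')

written = add_lengths(src_path, dst_path)

# Reading back works one object at a time.
with open(dst_path, "r", encoding="utf-8") as f:
    lengths = [json.loads(line)["length"] for line in f]

print(written, lengths)
```

Because every record sits on its own line, `readline()` returns a single small object, and editors no longer have to render one enormous line.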

I had the same problem. Processing a 64 MB JSON file took 9 hours, which is very slow. Here is how I increased the processing speed. Step 1: I converted the file from JSON to TXT. Step 2: I changed my processing algorithm to work on the text file. Step 3: Once the text file had the correct content, I converted it from text back to JSON.


 