How can I read the first 100 lines of a JSON metadata file and write them to a smaller JSON file? [Python]

Question

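I have a JSON metadata file containing about 26 GB of data. For obvious reasons, I need to extract the first 100 lines into a new JSON file to read from, so that I have to change as little of my downstream code as possible. The idea is to test on those 100 lines and, once debugging is done, run the code on the whole file.

I have read about exporting JSON to CSV, but I would like to keep the JSON structure and file type. Is it possible to do this with Python?

My file is a JSON with some extra data, so I first use a workaround to read it. It looks like this: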


{"_id":{"$oid":"5b9fd47507b317551a7bfb8f"},"title":"It's Okay If You Didn't Like 'Boyhood', And Here Are Many Reasons Why","url":"https://m.huffpost.com/us/entry/6694772","article_text"

This is how I read it:

import json

with open('metadata.json', 'r') as data:
    data = json.loads("[" + data.read().replace("}\n{", "},\n{") + "]")

Thanks!

Answer 1

You can try:

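import json
from itertools import islice

# Assumes one JSON object per line, as in the question's sample.
# islice reads only the first 100 lines, so the 26 GB file is never loaded whole.
with open('file.json') as ip_file:
  records = [json.loads(line) for line in islice(ip_file, 100) if line.strip()]

# Write the sample once, in 'w' mode, as a single valid JSON array.
with open('new_file.json', 'w') as out_file:
  json.dump(records, out_file)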
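
If the goal is to keep the downstream code unchanged, it may be simpler to copy the first 100 lines verbatim, so the small file has exactly the same one-object-per-line layout as the large one and the workaround from the question still applies. The following is a minimal sketch under that assumption; `copy_head` is a hypothetical helper name, and UTF-8 encoding is assumed rather than known from the question.

```python
from itertools import islice


def copy_head(src_path, dst_path, n=100):
    """Copy the first n lines of src_path to dst_path unchanged.

    Returns the number of lines written. Lines are streamed one at a
    time, so memory use does not depend on the size of the source file.
    """
    count = 0
    # Assumption: the file is UTF-8 text with one JSON object per line.
    with open(src_path, "r", encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in islice(src, n):
            dst.write(line)
            count += 1
    return count


# Example call (file names are placeholders):
# copy_head("metadata.json", "metadata_sample.json", 100)
```

Because the sample keeps the original layout, the same reading code can be pointed at the small file during debugging and at the full file afterwards.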

Solution 1
0 2019-12-04 12:45:05
