How do I iterate over chunks of JSON objects?

Question

I use the following function to chunk iterable Python objects.

from itertools import islice

def chunked_iterable(iterable, chunk_size):
    it = iter(iterable)
    while True:
        chunk = tuple(islice(it, chunk_size))
        if not chunk:
            break
        yield chunk

I'm looking to do something similar with a basic JSON file.

[
    {object1: 'object1'},
    {object2: 'object2'},
    {object3: 'object3'},
    {object4: 'object4'},
    {object5: 'object5'},
    {object6: 'object6'},
    etc...
]

Like this.

from pathlib import Path
import json

def json_chunk(json_array_of_objects, object_count):
    # What goes here?

if __name__ == '__main__':
    with open(Path(__file__).parent / 'raw_data.json') as raw_data:
        json_data = json.load(raw_data)

    for json_array_with_five_objects in json_chunk(json_data, 5):
        for obj in json_array_with_five_objects:
            print(obj)

Is "streaming" JSON data the term I'm looking for?
How do you stream JSON data?

As a learning exercise I'm trying to stick with base Python functionality for now, but answers using other packages are helpful too.

Answer 1

After further thought, using the object_hook or object_pairs_hook arguments would require reading the entire file into memory first. To avoid that, here's something that reads the file incrementally, line by line.

I had to modify your example JSON file to make it valid JSON (what you have in your question reads like Python dictionaries, with unquoted keys and single-quoted strings, neither of which JSON allows). Note that this code is format-specific in the sense that it assumes each JSON object in the array lies entirely on a single line, although it could be changed to handle multiline object definitions if necessary.

So here's a sample test input file with valid JSON contents:

[
    {"thing1": "object1"},
    {"thing2": "object2"},
    {"thing3": "object3"},
    {"thing4": "object4"},
    {"thing5": "object5"},
    {"thing6": "object6"}
]

Code:

from itertools import zip_longest
import json
from pathlib import Path

def grouper(n, iterable, fillvalue=None):
    """ s -> (s0, s1...sn-1), (sn, sn+1...s2n-1), (s2n, s2n+1...s3n-1), ... """
    return zip_longest(*[iter(iterable)]*n, fillvalue=fillvalue)

def read_json_objects(fp):
    """ Read objects from file containing an array of JSON objects, one per line. """
    next(fp)  # Skip first line (the opening '[').
    for line in (line.strip() for line in fp):
        if not line:  # Ignore blank lines instead of failing on them.
            continue
        if line.startswith(']'):  # Last line?
            break
        yield json.loads(line.rstrip(','))

def json_chunk(json_file_path, object_count):
    with open(json_file_path) as fp:
        for group in grouper(object_count, read_json_objects(fp)):
            # Drop the None padding that zip_longest adds to the final group.
            yield tuple(obj for obj in group if obj is not None)


if __name__ == '__main__':
    json_file_path = Path(__file__).parent / 'raw_data.json'

    for array in json_chunk(json_file_path, 5):
        print(array)

Output from processing test file:

({'thing1': 'object1'}, {'thing2': 'object2'}, {'thing3': 'object3'}, {'thing4': 'object4'}, {'thing5': 'object5'})
({'thing6': 'object6'},)
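
If the one-object-per-line assumption is too strict, one possible way to relax it while staying in the standard library is json.JSONDecoder.raw_decode, which parses a single JSON value from the start of a string and reports where it ended. The following is only a minimal sketch under some assumptions: the file holds one top-level array, the names iter_json_array and stream_json_chunks are made up for this example, and the default block_size is an arbitrary choice. It re-parses the buffer whenever an element is cut off at a block boundary, so it favors simplicity over speed.

```python
import json
from itertools import islice


def iter_json_array(fp, block_size=65536):
    """Yield the elements of a top-level JSON array, reading fp block by block.

    Hypothetical helper for illustration; block_size is an arbitrary default.
    """
    decoder = json.JSONDecoder()
    buffer = fp.read(block_size).lstrip()
    if not buffer.startswith('['):
        raise ValueError('expected a top-level JSON array')
    buffer = buffer[1:]
    at_eof = False
    while True:
        # Skip whitespace and the comma separating two elements
        # (this is lenient: stray extra commas are silently accepted).
        buffer = buffer.lstrip().lstrip(',').lstrip()
        if buffer.startswith(']'):
            return
        try:
            obj, end = decoder.raw_decode(buffer)
            # Only trust the value once its delimiter is visible; otherwise a
            # number such as 12345 could be cut short at a block boundary.
            complete = buffer[end:].lstrip().startswith((',', ']'))
        except json.JSONDecodeError:
            complete = False
        if complete:
            yield obj
            buffer = buffer[end:]
            continue
        if at_eof:
            raise ValueError('malformed or truncated JSON array')
        # The element may continue past the buffer: read more and retry.
        more = fp.read(block_size)
        at_eof = not more
        buffer += more


def stream_json_chunks(fp, object_count, block_size=65536):
    """Yield tuples of up to object_count elements from the array in fp."""
    objects = iter_json_array(fp, block_size)
    while True:
        chunk = tuple(islice(objects, object_count))
        if not chunk:
            return
        yield chunk


if __name__ == '__main__':
    import io

    # Objects may now span several lines or share a line.
    sample = io.StringIO('''[
        {"thing1": "object1"},
        {
            "thing2": "object2"
        },
        {"thing3": "object3"}, {"thing4": "object4"},
        {"thing5": "object5"},
        {"thing6": "object6"}
    ]''')
    for chunk in stream_json_chunks(sample, 5):
        print(chunk)
```

Only the current buffer and one chunk are held in memory at a time. For large or performance-sensitive inputs, a dedicated third-party streaming parser such as ijson is probably a better fit than this sketch.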

Answer 2

"JSON is a text format that is completely language independent but uses conventions that are familiar to programmers of the C-family of languages, including C, C++, C#, Java, JavaScript, Perl, Python, and many others. These properties make JSON an ideal data-interchange language." - https://www.json.org/

JSON is a string of text. You would need to convert it back to Python objects before you can iterate over it.
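
As a rough illustration of that point, assuming the whole document fits in memory: once json.loads (or json.load for a file) has turned the text into a Python list, the chunked_iterable function from the question can be applied unchanged. The raw_text value below is made-up sample data.

```python
import json
from itertools import islice


def chunked_iterable(iterable, chunk_size):
    # Copied from the question: yields tuples of up to chunk_size items.
    it = iter(iterable)
    while True:
        chunk = tuple(islice(it, chunk_size))
        if not chunk:
            break
        yield chunk


# Made-up sample data standing in for the contents of the JSON file.
raw_text = '[{"thing1": "object1"}, {"thing2": "object2"}, {"thing3": "object3"}]'

json_data = json.loads(raw_text)  # now a plain Python list of dicts

for chunk in chunked_iterable(json_data, 2):
    print(chunk)
```

This is not streaming, since the entire array is parsed up front, but it is the simplest option when the file is small.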

Answer 1: score 2, accepted, 2018-03-21 22:03:41

Answer 2: score 0, 2018-03-21 18:18:49
