Python multiprocessing: modifying JSON from multiple processes

Question

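I am trying to modify a JSON file using multiprocessing. I can split the JSON into chunks so that each process only accesses and modifies a specific part of it (which guarantees that no two processes will try to modify the same property). My question is: how do I share the JSON object between processes so that the changes are reflected in the original object? I know that multiprocessing passes objects as copies, so I need to use Manager(), but how exactly do I do that? Currently I have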

import json
import multiprocessing as mp
from multiprocessing import Pool

def parallelUpdateJSON(datachunk):
    for feature in datachunk:
        pass  # modify chunk

def writeGeoJSON():
    with open('geo.geojson') as f:
        data = json.load(f)
    # chunk data into a list, so I get listofchunks = [chunk1, chunk2, etc.]
    # where chunk1 = data[0:chunksize], chunk2 = data[chunksize:2*chunksize] etc.
    # (this slicing assumes data is a list)
    chunksize = max(1, -(-len(data) // mp.cpu_count()))  # ceiling division
    listofchunks = [data[i:i + chunksize] for i in range(0, len(data), chunksize)]
    pool = Pool()
    pool.map(parallelUpdateJSON, listofchunks)
    pool.close()
    pool.join()
    with open('test_parallel.geojson', 'w') as outfile:
        json.dump(data, outfile)

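But of course this passes the chunks as copies, so the original data object is not modified. How can I make data actually be modified by the processes? Thanks!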

Answer 1

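It is probably a better idea to avoid synchronized access to the output file. It is much easier to produce N partial outputs and join them together as a property of the JSON object. You can then dump that object to a file.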

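import json
from multiprocessing import Pool

def process(work):
    # Runs in a worker process: return the reversed string as the partial output.
    return str(work[::-1])

if __name__ == "__main__":
    structure = json.loads("""
    { "list":
        [
            "the quick brown fox jumped over the lazy dog",
            "the quick brown dog jumped over the lazy fox"
        ]
    }
    """)
    # The parent collects the partial outputs and attaches them to the object.
    with Pool() as p:
        structure["results"] = p.map(process, structure["list"])
    #print(json.dumps(structure))
    with open("result.json", "w") as f:
        json.dump(structure, f)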
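
The same idea can be applied to the chunked case from the question. What follows is only a minimal sketch under assumptions: the names update_feature, update_chunk, split_into_chunks and parallel_update are hypothetical, the "processed" flag stands in for whatever modification is really needed, and the features are assumed to live in a list under a "features" key. Each worker returns its modified chunk instead of mutating shared state, and the parent process stitches the chunks back together.

```python
import json
from multiprocessing import Pool


def update_feature(feature):
    # Hypothetical modification: mark the feature as processed.
    updated = dict(feature)
    updated["processed"] = True
    return updated


def update_chunk(chunk):
    # Runs in a worker process; returns a new list rather than mutating shared state.
    return [update_feature(feature) for feature in chunk]


def split_into_chunks(items, n_chunks):
    # Ceiling division so that every item lands in exactly one chunk.
    size = max(1, -(-len(items) // n_chunks))
    return [items[i:i + size] for i in range(0, len(items), size)]


def parallel_update(features, n_workers=2):
    chunks = split_into_chunks(features, n_workers)
    with Pool(n_workers) as pool:
        # pool.map returns results in the same order as the input chunks.
        results = pool.map(update_chunk, chunks)
    # Flatten the list of chunks back into a single list of features.
    return [feature for chunk in results for feature in chunk]


if __name__ == "__main__":
    data = {"type": "FeatureCollection",
            "features": [{"id": i} for i in range(5)]}
    data["features"] = parallel_update(data["features"])
    print(json.dumps(data))
```

Because the parent reassigns data["features"] with the returned values, no Manager() is needed here; the cost is that each chunk is pickled once on the way to the worker and once on the way back.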

Solution 1
-1 2017-03-05 12:44:06
