Parallel Cassandra requests using Python multiprocessing library errors on wait()

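I wrote a piece of multiprocessing code. It connects to Cassandra, where I run 32 queries to fetch data. I tried to parallelize the fetches with Python's multiprocessing library. The code looks like this.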

    import json
    import multiprocessing as mp
    from functools import reduce

    from cassandra.cluster import Cluster

    # compute() and aggregate() are defined elsewhere (not shown here)
    cluster = Cluster(['xyz'])
    session = cluster.connect()

    query = session.prepare('SELECT stuff')
    session.default_timeout = 600000
    session.default_fetch_size = 100
    # One asynchronous request per value of i; each call returns a future
    queries = [
        session.execute_async(query, ['2021-10-19', i])
        for i in range(32)
    ]
    pool = mp.Pool(32)
    inter_obj = pool.map_async(compute, queries)
    inter_obj.wait()
    res = inter_obj.get()

    pool.close()
    pool.join()
    final_response = reduce(aggregate, res)
    resp = json.dumps(final_response, sort_keys=True, indent=4).encode("utf-8")
    print("RESPONSE", resp)

When I run the program, it errors out on wait():

Traceback (most recent call last):
  File "/usr/local/bin/date-run", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/sc_eol/run_stuff.py", line 75, in main
    res = inter_obj.get()
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 768, in get
    raise self._value
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 537, in _handle_tasks
    put(task)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/usr/lib/python3.8/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
TypeError: cannot pickle '_thread.RLock' object
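
The last line of the traceback is the key. `pool.map_async` has to pickle every argument before sending it to a worker process, and the objects in `queries` (or something they reference) hold a thread lock. Here is a minimal sketch that triggers the same kind of error without Cassandra. It assumes a hypothetical `FakeFuture` class standing in for the driver's future object; `try_pickle` is also just an illustrative helper.

```python
import pickle
import threading


class FakeFuture:
    """Hypothetical stand-in for an object that owns a thread lock."""

    def __init__(self):
        self._lock = threading.RLock()


def try_pickle(obj):
    """Return None if obj pickles cleanly, else the TypeError message."""
    try:
        pickle.dumps(obj)
    except TypeError as exc:
        return str(exc)
    return None


# Pickling fails because the RLock attribute cannot be serialized.
error = try_pickle(FakeFuture())
print(error)
```

Plain data such as lists, strings, and numbers pickles without trouble, which is why the usual advice is to send only plain data to a pool.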

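execute_async() returns a ResponseFuture object. multiprocessing has to pickle every argument it sends to a worker process, and these futures cannot be pickled, hence the TypeError. You are better off building a list of "futures" like this: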

futures = []
query = ...
for ... :
    futures.append(session.execute_async(query, ...))

This approach executes the queries concurrently. You can then iterate over the results with:

for future in futures:
    rows = future.result()
    # insert processing here

The call to result() blocks until the request returns a result or an error.
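
To tie the two snippets together without needing a cluster, here is a rough sketch. It assumes a hypothetical `FakeSession` whose `execute_async` returns a standard-library `concurrent.futures.Future`. That class also has a blocking `result()` method, but it is not the driver's `ResponseFuture`. `compute` and `aggregate` are placeholders, since the question does not show their bodies.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


class FakeSession:
    """Hypothetical stand-in for a driver session; not the Cassandra API."""

    def __init__(self):
        self._executor = ThreadPoolExecutor(max_workers=8)

    def execute_async(self, query, params):
        # Pretend each query returns one row containing its bucket number.
        return self._executor.submit(lambda: [{"bucket": params[1]}])

    def shutdown(self):
        self._executor.shutdown()


def compute(rows):
    # Placeholder per-query processing: count the rows.
    return len(rows)


def aggregate(left, right):
    # Placeholder reduction: sum the per-query results.
    return left + right


session = FakeSession()
# Submit all 32 requests first so they are in flight at the same time.
futures = [
    session.execute_async("SELECT stuff", ["2021-10-19", i])
    for i in range(32)
]
# Then collect: result() blocks until each request has finished.
partials = [compute(future.result()) for future in futures]
total = reduce(aggregate, partials)
session.shutdown()
print(total)
```

No process pool is involved here. The futures stay in the parent process, so nothing has to be pickled. If `compute` turns out to be CPU-heavy, one option is to call `result()` in the parent first and hand only plain, picklable data (for example lists or dicts) to a pool.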

For details, see the Cassandra Python driver getting started guide. Cheers!

