Py2neo Neo4j Batch submit error

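I have a JSON file with data for about 1.4 million nodes, and I want to build a Neo4j graph database from it. I tried to use py2neo's batch submit function. My code is as follows: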

# the variable words is a list containing node names
from py2neo import neo4j
batch = neo4j.WriteBatch(graph_db)
nodedict = {}
# I decided to use a dictionary because I would be creating relationships
# by referring to the dictionary entries later
for i in words:
    nodedict[i] = batch.create({"name":i})
results = batch.submit()

The error shown is as follows:

Traceback (most recent call last):
  File "test.py", line 36, in <module>
    results = batch.submit()
  File "/usr/lib/python2.6/site-packages/py2neo/neo4j.py", line 2116, in submit
    for response in self._submit()
  File "/usr/lib/python2.6/site-packages/py2neo/neo4j.py", line 2085, in _submit
    for id_, request in enumerate(self.requests)
  File "/usr/lib/python2.6/site-packages/py2neo/rest.py", line 427, in _send
    return self._client().send(request)
  File "/usr/lib/python2.6/site-packages/py2neo/rest.py", line 364, in send
    return Response(request.graph_db, rs.status, request.uri, rs.getheader("Loc$
  File "/usr/lib/python2.6/site-packages/py2neo/rest.py", line 278, in __init__
    raise SystemError(body)
SystemError: None

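Can anyone tell me what exactly is happening here? Does it have anything to do with the batch query being too large? If so, what can be done? Thanks in advance! :)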

So here's what I figured out (thanks to this question: py2neo - Neo4j - System Error - Create Batch Nodes/Relations):

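The py2neo batch submit function has its own limits on how many queries it can handle. I couldn't find an exact upper bound, so I tried limiting each batch to 5000 queries. I decided to run the following code: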

# the variable words is a list containing node names
from py2neo import neo4j
batch = neo4j.WriteBatch(graph_db)
nodedict = {}
# I decided to use a dictionary because I would be creating relationships
# by referring to the dictionary entries later

for index, i in enumerate(words):
    nodedict[i] = batch.create({"name":i})
    # Submit after every 5000th job; (index + 1) avoids a one-item batch at index 0
    if (index + 1) % 5000 == 0:
        batch.submit()
        batch = neo4j.WriteBatch(graph_db) # As stated by Nigel below, I'm creating a new batch
batch.submit() # for the final, possibly partial, batch

This way, I sent the requests in batches of 5000 queries each and successfully created the whole graph!

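There's no real way to state a limit on the number of jobs a batch can contain: it varies widely depending on a number of factors. Generally, your best bet is to experiment to find the optimal size for your use case and go with that. It looks like that's what you're already doing :-)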
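One way to run that experiment is to time a few candidate sizes against a sample of the data. The following is only a minimal sketch: `time_chunk_sizes` and the `submit_chunk` callback are hypothetical names, not part of py2neo, and the callback is assumed to wrap whatever actually creates and submits one batch.

```python
import time

def time_chunk_sizes(items, submit_chunk, sizes):
    """Return {chunk_size: elapsed_seconds} for sending all items at each size.

    submit_chunk is a hypothetical callback that receives one list of items
    and is expected to build and submit a single batch for it.
    """
    timings = {}
    for size in sizes:
        start = time.perf_counter()
        for pos in range(0, len(items), size):
            submit_chunk(items[pos:pos + size])
        timings[size] = time.perf_counter() - start
    return timings
```

Note that every candidate size sends the full list again, so this is best run with a small sample against a scratch database rather than the real one.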

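I'd suggest one tweak to your solution. Batch objects are not designed to be reused, so instead of clearing the batch after each submission, create a new batch object. The ability to submit a batch multiple times will be removed in the next version of py2neo anyway.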
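The "new batch object per submission" advice can be sketched as a small helper. This is an illustration under assumptions, not py2neo's own API: `submit_in_chunks` and `make_batch` are made-up names, and `FakeBatch` is a stand-in with the same `create`/`submit` shape as the `WriteBatch` used in the question, so the example runs without a Neo4j server.

```python
class FakeBatch:
    """Hypothetical stand-in for neo4j.WriteBatch (no server needed)."""

    def __init__(self):
        self.jobs = []

    def create(self, properties):
        self.jobs.append(properties)
        return len(self.jobs) - 1  # position of the job within this batch

    def submit(self):
        return list(self.jobs)


def submit_in_chunks(items, make_batch, chunk_size=5000):
    """Submit items in fixed-size chunks, using a fresh batch for each chunk."""
    results = []
    for start in range(0, len(items), chunk_size):
        batch = make_batch()  # never reuse a batch that was already submitted
        for item in items[start:start + chunk_size]:
            batch.create({"name": item})
        results.extend(batch.submit())
    return results


created = submit_in_chunks(["a", "b", "c", "d", "e"], FakeBatch, chunk_size=2)
print(len(created))  # 5
```

With the setup from the question, the factory would presumably be `lambda: neo4j.WriteBatch(graph_db)`.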

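I ran into the same problem after I started using batch creation via graph.create(*alist). The answer above pointed me in the right direction, and I ended up using this snippet from https://gist.github.com/anonymous/6293739, inspired by this question: py2neo - Neo4j - System Error - Create Batch Nodes/Relations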

chunk_size=500
chunks=(alist[pos:pos + chunk_size] for pos in xrange(0, len(alist), chunk_size))
for c in chunks:
    graph.create(*c)
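The snippet above relies on `xrange` (Python 2 only) and on slicing, so `alist` has to be a list. As a hedged alternative, here is a sketch of a chunking generator that also runs on Python 3 and accepts any iterable; `chunked` is a name I made up for illustration.

```python
from itertools import islice

def chunked(iterable, chunk_size):
    """Yield successive lists of at most chunk_size items from iterable."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return  # iterator exhausted, stop without yielding an empty chunk
        yield chunk

# Assumed usage, mirroring the loop above:
# for c in chunked(alist, 500):
#     graph.create(*c)
```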

P.S. py2neo == 2.0.7
