Py2neo Neo4j Batch submit error

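I have a JSON file with data for about 1.4 million nodes, and I want to build a Neo4j graph database from it. I tried to use py2neo's batch submit function. My code is as follows: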

# the variable words is a list containing node names
from py2neo import neo4j
batch = neo4j.WriteBatch(graph_db)
nodedict = {}
# I decided to use a dictionary because I would be creating relationships
# by referring to the dictionary entries later
for i in words:
    nodedict[i] = batch.create({"name":i})
results = batch.submit()

The error shown is:

Traceback (most recent call last):
  File "test.py", line 36, in <module>
    results = batch.submit()
  File "/usr/lib/python2.6/site-packages/py2neo/neo4j.py", line 2116, in submit
    for response in self._submit()
  File "/usr/lib/python2.6/site-packages/py2neo/neo4j.py", line 2085, in _submit
    for id_, request in enumerate(self.requests)
  File "/usr/lib/python2.6/site-packages/py2neo/rest.py", line 427, in _send
    return self._client().send(request)
  File "/usr/lib/python2.6/site-packages/py2neo/rest.py", line 364, in send
    return Response(request.graph_db, rs.status, request.uri, rs.getheader("Loc$
  File "/usr/lib/python2.6/site-packages/py2neo/rest.py", line 278, in __init__
    raise SystemError(body)
SystemError: None

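Can anyone tell me what exactly is happening here? Does it have something to do with the batch being so large? If so, what can be done about it? Thanks in advance! :)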

So, here is what I figured out (thanks to this question: py2neo - Neo4j - System Error - Create Batch Nodes/Relationships):

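py2neo's batch submit function has its own limits on how many queries a single batch can carry. I could not find an exact upper bound, so I tried limiting each batch to 5000 queries and ran the following code: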

# the variable words is a list containing node names
from py2neo import neo4j
batch = neo4j.WriteBatch(graph_db)
nodedict = {}
# I decided to use a dictionary because I would be creating relationships
# by referring to the dictionary entries later

for index, i in enumerate(words):
    nodedict[i] = batch.create({"name":i})
    if (index + 1) % 5000 == 0:  # submit after every 5000 jobs (index % 5000 would fire at index 0)
        batch.submit()
        batch = neo4j.WriteBatch(graph_db) # As stated by Nigel below, I'm creating a new batch
batch.submit() #for the final batch

This way I sent the requests in batches of 5k queries each and successfully created the whole graph!

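There's no real way to state a limit on the number of jobs a batch can contain; it varies widely depending on a number of factors. Generally, your best bet is to experiment to find the optimal size for your use case and then stick with it. It looks like that's what you're already doing :-)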

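I would suggest one adjustment to your solution. Batch objects are not designed to be reused, so rather than clearing the batch after each submission, create a new batch object. In any case, the ability to submit a batch more than once will be removed in the next version of py2neo.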
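As a rough illustration of the "new batch object per submission" advice, here is a minimal sketch. The function name `create_nodes_in_batches` and the `make_batch` factory are hypothetical, not part of py2neo; with the API used in the question the factory could be `lambda: neo4j.WriteBatch(graph_db)`. The sketch also assumes that `submit()` returns one result per job, in job order, which is worth verifying against the py2neo version in use.

```python
def create_nodes_in_batches(names, make_batch, batch_size=5000):
    """Create one node per name, using a fresh batch for every batch_size jobs.

    make_batch is a hypothetical zero-argument factory returning an object
    with create(properties) and submit() methods.
    ASSUMPTION: submit() returns one result per job, in the order added.
    """
    created = {}
    batch, pending = make_batch(), []
    for name in names:
        batch.create({"name": name})
        pending.append(name)
        if len(pending) == batch_size:
            # Pair each name with the result of its own job.
            created.update(zip(pending, batch.submit()))
            # Start over with a new batch instead of reusing the old one.
            batch, pending = make_batch(), []
    if pending:
        # Submit the leftover jobs; skipped when nothing is pending.
        created.update(zip(pending, batch.submit()))
    return created
```

Keeping the submitted results in the dictionary, rather than the in-batch job references, may be the more useful thing to look up when creating relationships in later batches, but that depends on how the rest of the import is written.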

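I ran into the same problem after I started creating in bulk via graph.create(*alist). The answer above pointed me in the right direction, and I ended up using this snippet from https://gist.github.com/anonymous/6293739, which was inspired by this question: py2neo - Neo4j - System Error - Create Batch Nodes/Relationships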

chunk_size=500
chunks=(alist[pos:pos + chunk_size] for pos in xrange(0, len(alist), chunk_size))
for c in chunks:
    graph.create(*c)

P.S. This was with py2neo == 2.0.7
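The snippet above relies on `xrange`, which exists only in Python 2. For anyone on Python 3, the same slicing can be sketched with `range`; the helper name `chunked` is made up for this example and is not part of py2neo.

```python
def chunked(items, chunk_size):
    """Yield consecutive slices of a list, each at most chunk_size long."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for pos in range(0, len(items), chunk_size):
        yield items[pos:pos + chunk_size]
```

Usage would then mirror the loop above: `for c in chunked(alist, 500): graph.create(*c)`.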
