
py2neo Cypher transactions failing

I'm trying to batch import millions of nodes through py2neo. I don't know which is faster, the WriteBatch or the cypher.Transaction, but the latter seemed the best option, as I need to split my batches. However, when I try to execute a simple transaction, I receive a weird error.

The Python code:

session = cypher.Session("http://127.0.0.1:7474/db/data/")  # error also occurs without /db/data/

def init():
    tx = session.create_transaction()

    for ngram, one_grams in data.items():
         tx.append("CREATE "+str(n)+":WORD {'word': "+ngram+", 'rank': "+str(ngram_rank)+", 'prob': "+str(ngram_prob)+", 'gram': '0gram'}")
         tx.execute()  # line 69 in the error below

The error:

Traceback (most recent call last):
  File "Ngram_neo4j.py", line 176, in <module>
    init(rNgram_file="dataset_id.json")
  File "Ngram_neo4j.py", line 43, in init
    data = probability_items(data)
  File "Ngram_neo4j.py", line 69, in probability_items
    tx.execute()
  File "D:\datasets\GOOGLE~1\virtenv\lib\site-packages\py2neo\cypher.py", line 224, in execute
    return self._post(self._execute or self._begin)
  File "D:\datasets\GOOGLE~1\virtenv\lib\site-packages\py2neo\cypher.py", line 209, in _post
    raise TransactionError(error["code"], error["status"], error["message"])
KeyError: 'status'

I tried catching the exception:

    except cypher.TransactionError as e:
        print("--------------------------------------------------------------------------------------------")
        print(e.status)
        print(e.message)

But it never gets called. (Maybe an error on my part?)

Regular inserts using graph_db.create({"node": node}) do work, but are incredibly slow (36 hours for 2.5M nodes). Note that the dataset consists of a series of JSON files, each with a structure five levels deep. I'd like to batch the last two levels (around 100 to 20,000 nodes per batch).
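For reference, splitting the last two levels into fixed-size batches can be done with a simple chunking helper. The sketch below is a generic illustration; the py2neo calls (session.create_transaction, tx.append, tx.commit) are shown only in comments, since they require a running Neo4j server, and the batch size of 1000 is an arbitrary choice:

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch

# Hypothetical usage with a py2neo 1.6 cypher session (not run here):
# for batch in chunked(data.items(), 1000):
#     tx = session.create_transaction()
#     for ngram, one_grams in batch:
#         tx.append("CREATE (n:WORD {word: {w}})", {"w": ngram})
#     tx.commit()  # one round of commits per batch
```

Committing once per batch instead of executing after every append keeps the number of HTTP round trips small.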

--- EDIT ---

I'm using py2neo 1.6.1 and Neo4j 2.0.0, currently on Windows 7 (but also OS X Mavericks and CentOS 6).

The problem you're seeing is due to a last-minute alteration in the way Cypher transaction errors are reported by the Neo4j server. Py2neo 1.6 was built against M05/M06, and when a few features changed in RC1/GA, py2neo broke in a few places.

This has been fixed for py2neo 1.6.2 ( https://github.com/nigelsmall/py2neo/issues/224 ) but I do not yet know when I will get a chance to finish and release this version.

Which Neo4j and py2neo versions are you using?

You should use parameters for your CREATE statements.

Can you check the server logs in data/logs and data/graph.db/messages.log for errors?

If you have that much data to insert, then perhaps a direct batch insertion would make more sense?

See: http://neo4j.org/develop/import

Two tools I wrote for this:
