What is an efficient way to run Cypher statements in bulk?
I want to import data into my Neo4j database. From my raw data, I generate a lot of Cypher statements.
For example, I have a list of Cypher statements like this (up to hundreds of thousands):
MERGE (product:PRODUCT{name:'X phone'}) MERGE (product)-[:RATE]-(review:REVIEW{content:'worst phone ever'})
MERGE (product:PRODUCT{name:'X phone'}) MERGE (product)-[:RATE]-(review:REVIEW{content:'cheapest phone ever'})
MERGE (product:PRODUCT{name:'Y phone'}) MERGE (product)-[:RATE]-(review:REVIEW{content:'even worse than phone X'})
MERGE (product:PRODUCT{name:'X phone'}) MERGE (product)-[:RATE]-(review:REVIEW{content:'better than newly release Y version'})
My current solution is to run the statements line by line from a file, using the Neo4j driver in Python.
# neo4j.v1 is the import path of the 1.x Python driver;
# with driver 4.x and later, use `from neo4j import GraphDatabase` instead.
from neo4j.v1 import GraphDatabase
import sys


class CypherClient:
    """Client that executes Cypher statements."""

    def __init__(self, uri, auth):
        self.driver = GraphDatabase.driver(uri, auth=auth)

    def run_cypher(self, cypher):
        """
        Execute a single Cypher statement.
        :param cypher: the Cypher statement as a str
        :return: None
        """
        # A new session (and auto-commit transaction) is opened per statement.
        with self.driver.session() as session:
            session.run(cypher).single()


if __name__ == "__main__":
    # Execute Cypher from a file; each line is an independent statement.
    # Usage: python exec_cypher_file.py outcypher.txt

    # replace URI and authentication here
    uri = "bolt://localhost:7687"
    auth = ("neo4j", "IAmPusheenTheCat")
    counter = 0
    if len(sys.argv) < 2:
        sys.exit("usage: python exec_cypher_file.py <cypher_file>")
    client = CypherClient(uri, auth)
    infile = sys.argv[1]
    with open(infile) as statements, open(infile + ".err.txt", "w") as errfile:
        for line in statements:
            try:
                client.run_cypher(line)
            except Exception:
                # Log the failing statement and continue with the next one.
                print(str(counter) + " " + line + "\n")
                errfile.write(str(counter) + " " + line + "\n")
            counter += 1
            if counter % 100 == 0 or counter < 100:
                print(counter)
    print('done')
What could I do to improve the efficiency of running Cypher in bulk?
CSV loading tends to be very efficient, so if you have your data in CSV form you can use LOAD CSV.
Otherwise, you can check Michael Hunger's article on effective batch updates, which uses UNWIND to process a list of inputs as a batch.
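As a rough sketch of the LOAD CSV route, assuming the raw data can be reduced to (product, review) pairs instead of generated statements: the helper `write_reviews_csv`, the file name `reviews.csv`, and the column names `product` and `review` are all made up for illustration. With a default configuration, a `file:///` URL is resolved against Neo4j's import directory, so the CSV would need to be placed there.

```python
import csv

# Hypothetical query: the file name and the column names are assumptions.
# It mirrors the MERGE pattern from the question, once per CSV row.
LOAD_CSV_QUERY = """
LOAD CSV WITH HEADERS FROM 'file:///reviews.csv' AS row
MERGE (product:PRODUCT {name: row.product})
MERGE (product)-[:RATE]-(review:REVIEW {content: row.review})
"""


def write_reviews_csv(path, rows):
    """Write dicts with 'product' and 'review' keys to a CSV file with a header.

    Returns the number of data rows written.
    """
    count = 0
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["product", "review"])
        writer.writeheader()
        for row in rows:
            writer.writerow(row)
            count += 1
    return count


# Usage sketch (needs a running Neo4j instance, so it is left commented out):
# write_reviews_csv("/path/to/neo4j/import/reviews.csv", rows)
# with driver.session() as session:
#     session.run(LOAD_CSV_QUERY).consume()
```

The whole file is then loaded by a single statement rather than one round trip per row.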
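The UNWIND approach could look roughly like the sketch below. It is not taken from the article: the names `chunked` and `run_batches`, the batch size of 1000, and the `product`/`review` keys are assumptions for illustration. The idea is to send one parameterized statement per batch of rows instead of one statement per row.

```python
from itertools import islice

# One parameterized statement; $rows is a list of maps supplied per batch.
# The 'product' and 'review' keys are assumed names, not from the question.
UNWIND_QUERY = """
UNWIND $rows AS row
MERGE (product:PRODUCT {name: row.product})
MERGE (product)-[:RATE]-(review:REVIEW {content: row.review})
"""


def chunked(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def run_batches(session, rows, batch_size=1000):
    """Run UNWIND_QUERY once per batch; return the number of rows sent."""
    total = 0
    for batch in chunked(rows, batch_size):
        # Each run() outside an explicit transaction is auto-committed.
        session.run(UNWIND_QUERY, rows=batch).consume()
        total += len(batch)
    return total


# Usage sketch (needs a running Neo4j instance, so it is left commented out):
# rows = [{"product": "X phone", "review": "worst phone ever"}, ...]
# with driver.session() as session:
#     run_batches(session, rows)
```

Because the query text stays the same and only the parameters change, the server can reuse the query plan, and the number of round trips drops from one per row to one per batch.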