What is an efficient way to run a large batch of Cypher statements?

I want to import data into my Neo4j database. From my raw data, I generate a lot of Cypher statements.

For example, I have a list of Cypher statements like this (up to hundreds of thousands):

MERGE (product:PRODUCT{name:'X phone'}) MERGE (product)-[:RATE]-(review:REVIEW{content:'worst phone ever'})
MERGE (product:PRODUCT{name:'X phone'}) MERGE (product)-[:RATE]-(review:REVIEW{content:'cheapest phone ever'})
MERGE (product:PRODUCT{name:'Y phone'}) MERGE (product)-[:RATE]-(review:REVIEW{content:'even worse than phone X'})
MERGE (product:PRODUCT{name:'X phone'}) MERGE (product)-[:RATE]-(review:REVIEW{content:'better than newly release Y version'})

My current solution is to run the statements line by line from a file, using the Neo4j driver in Python.

from neo4j import GraphDatabase  # the neo4j.v1 package was removed in driver 4.0
import sys

class CypherClient:
    """
    The client that execute cypher
    """
    def __init__(self, uri, auth):
        self.driver = GraphDatabase.driver(uri, auth=auth)

    def run_cypher(self, cypher):
        """
        execute single cypher
        :param cypher: the cypher in str
        :return: no return anything at all
        """
        with self.driver.session() as session:
            session.run(cypher).single()

if __name__=="__main__":

    """
    execute cypher from file
    each line is independent cypher
    python exec_cypher_file.py outcypher.txt 
    """

    # replace URI and authentication here
    uri = "bolt://localhost:7687"
    auth = ("neo4j", "IAmPusheenTheCat")

    counter = 0

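    if len(sys.argv) < 2:
        # no input file given; exit with a usage message
        sys.exit("usage: python exec_cypher_file.py outcypher.txt")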
    else:
        client = CypherClient(uri, auth)
        infile = sys.argv[1]
        errfile = open(infile+".err.txt", 'w')
        for line in open(infile):
            # print(line)
            try:
                client.run_cypher(line)
            except Exception:  # a bare except would also swallow KeyboardInterrupt
                print(str(counter) + " " + line+"\n")
                errfile.write(str(counter) + " " + line+"\n")
            counter+=1
            if counter % 100 == 0 or counter < 100:
                print(counter)
        errfile.close()
    print('done')

What can I do to run bulk Cypher more efficiently?

CSV loading tends to be very efficient, so if your data is in CSV form you can use LOAD CSV.
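
As a rough sketch of the CSV route, assuming the raw data can be reduced to (product name, review content) pairs: the hypothetical helper `write_reviews_csv` below writes those pairs to a file, and `LOAD_CSV_QUERY` is one possible LOAD CSV statement that reproduces the MERGE pattern from the question. The file name `reviews.csv` and the column names `product` and `review` are assumptions, and the file would normally have to be placed in Neo4j's import directory for the `file:///` URL to resolve.

```python
import csv

# One possible LOAD CSV statement; the column names match the header written below.
# The file name reviews.csv is an assumption for this sketch.
LOAD_CSV_QUERY = """
LOAD CSV WITH HEADERS FROM 'file:///reviews.csv' AS row
MERGE (product:PRODUCT {name: row.product})
MERGE (product)-[:RATE]-(review:REVIEW {content: row.review})
"""


def write_reviews_csv(pairs, path):
    """Write (product name, review content) pairs as a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["product", "review"])
        writer.writerows(pairs)  # the csv module quotes commas and quotes as needed
```

The statement is then run once instead of once per row. For very large files it is worth checking which chunked-commit form your Neo4j version supports (USING PERIODIC COMMIT on older versions, CALL { ... } IN TRANSACTIONS on newer ones), and an index or uniqueness constraint on `PRODUCT(name)` should keep each MERGE from scanning every product node.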

Otherwise, you can check Michael Hunger's article on efficient batch updates, which uses UNWIND to process a list of inputs as a batch.
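
A minimal sketch of the UNWIND approach with the Python driver, assuming each input row is a dict with `product` and `review` keys. The names `UNWIND_QUERY`, `chunks` and `run_in_batches` are made up for this example, and the default batch size of 10000 is only a starting point to tune. Instead of one statement per row, each batch is passed as a single `$rows` parameter:

```python
# Parameterized statement: $rows is a list of maps, one per (product, review) pair.
UNWIND_QUERY = """
UNWIND $rows AS row
MERGE (product:PRODUCT {name: row.product})
MERGE (product)-[:RATE]-(review:REVIEW {content: row.review})
"""


def chunks(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_in_batches(session, rows, batch_size=10000):
    """Send `rows` to Neo4j in batches, one UNWIND statement per batch.

    `session` is expected to behave like a neo4j driver session,
    i.e. to offer run(query, **parameters). Returns the number of rows sent.
    """
    sent = 0
    for batch in chunks(rows, batch_size):
        # consume() waits for the statement to finish before the next batch is sent
        session.run(UNWIND_QUERY, rows=batch).consume()
        sent += len(batch)
    return sent
```

With the driver from the question this could be called as `with driver.session() as session: run_in_batches(session, rows)`, where `rows` looks like `[{"product": "X phone", "review": "worst phone ever"}, ...]`. Passing values as parameters also avoids building Cypher strings by hand, so quotes inside review text no longer break the statement.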

