Optimize Neo4j Cypher import query

I created an application in C# using Neo4j.Driver.V1 that reads from a CSV file and writes the data into a Neo4j graph database.

My CSV has 1000 records. Each record is split into 5 nodes with relationships between them.

The whole process takes 1 min 11 seconds (including 1 second for my own logic that builds the query).

This is way too much considering they will be uploading millions of records.

Here is my query:

MERGE 
    (accountd71d278a8eeb468f9e4517ac1e007fe5:Account 
        { 
            number: '952'
        } )         
ON CREATE 
    SET accountd71d278a8eeb468f9e4517ac1e007fe5 += 
    { 
        number: '952', 
        balanceType: 2, 
        accountType: 2, 
        openDate: apoc.date.parse('7/9/2015', 'ms', 'M/d/yyyy')
    }

MERGE (account13aa03cd1b6d449e88a3e5e5a22353da:Account 
    { 
        number: '198'
    } ) 
ON CREATE 
    SET account13aa03cd1b6d449e88a3e5e5a22353da += 
    { 
        number: '198'
    } 

MERGE (transactionba1459c4f7854157be237e7365497fcf:Transaction 
    { 
        number: '1'
    } ) 

ON CREATE 
    SET transactionba1459c4f7854157be237e7365497fcf += 
    { 
        number: '1', 
        amount: 3717.81, 
        type: 2, 
        date: apoc.date.parse('2016-05-27', 'ms', 'yyyy-MM-dd')
    }  

MERGE (bank3679799504f54bed9f079848be9c6eff:Bank 
    { 
        code: 'MMBC'
    } ) 
ON CREATE 
    SET bank3679799504f54bed9f079848be9c6eff += 
    { 
        code: 'MMBC', 
        country: 'Mongolia'
    }  

MERGE (bank522b6b6ed04d40bd9d87d4ecc36fbde2:Bank 
    { 
        code: 'VALL'
    } ) 
ON CREATE 
    SET bank522b6b6ed04d40bd9d87d4ecc36fbde2 += 
    { 
        code: 'VALL', 
        country: 'Mongolia'
    }  

MERGE (accountd71d278a8eeb468f9e4517ac1e007fe5)-[:credits]->(transactionba1459c4f7854157be237e7365497fcf) 
MERGE (accountd71d278a8eeb468f9e4517ac1e007fe5)-[:residesWith]->(bank3679799504f54bed9f079848be9c6eff) 
MERGE (transactionba1459c4f7854157be237e7365497fcf)-[:debits]->(account13aa03cd1b6d449e88a3e5e5a22353da) 
MERGE (account13aa03cd1b6d449e88a3e5e5a22353da)-[:residesWith]->(bank522b6b6ed04d40bd9d87d4ecc36fbde2)

Any ideas how I can reduce the time of my query?

Before offering any ideas, here is what I have already tried:

  1. Removing the long GUID-based variable names
  2. Removing the use of apoc.date.parse
  3. Considered using the built-in CSV import functionality, but the database is on another server
  4. Combining multiple records into one query (2 records at once performed best)
  5. Creating constraints

Thanks in advance!

K

Answer

Here is a list of points to optimize your process:

  • Use query parameters: all your query data should be passed as parameters. If you do, Neo4j will not have to re-plan the query every time.
  • Batch your queries: I think you are running one transaction per CSV row. Try to batch your queries (one transaction for 1000 rows should be OK, but as your CSV grows you will need to split the work across more transactions).
  • Create one query per node/relationship creation instead of one big query, and for the relationships use the MATCH MATCH MERGE pattern (you have the constraints, so it will be fast).
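
As a rough illustration of the three points above, the Cypher below is a minimal sketch, not a tested solution. It assumes the C# application sends a batch of CSV rows as a single parameter named `rows` (a list of maps). The parameter name and the map keys (`fromNumber`, `txnNumber`, and so on) are hypothetical. `UNWIND` is one common way to process such a list in a single statement; the answer itself only says to batch. On some older Neo4j releases the parameter may need to be written `{rows}` instead of `$rows`.

```cypher
// Assumed parameter: $rows, a list of maps built by the application, e.g.
// [{fromNumber: '952', txnNumber: '1', amount: 3717.81, type: 2}, ...]
// All key names here are illustrative, not taken from the original query.

// Statement 1: one query for the Account nodes of the whole batch.
UNWIND $rows AS row
MERGE (a:Account {number: row.fromNumber})
ON CREATE SET a.balanceType = row.balanceType,
              a.accountType = row.accountType;

// Statement 2: one query for the Transaction nodes.
UNWIND $rows AS row
MERGE (t:Transaction {number: row.txnNumber})
ON CREATE SET t.amount = row.amount,
              t.type = row.type;

// Statement 3: relationships via MATCH MATCH MERGE.
// Both lookups can use the uniqueness constraints on `number`.
UNWIND $rows AS row
MATCH (a:Account {number: row.fromNumber})
MATCH (t:Transaction {number: row.txnNumber})
MERGE (a)-[:credits]->(t);
```

Because the query text never changes between batches, the query plan can be reused, and each statement can be run once per batch (for example 1000 rows) inside a single transaction. The remaining node types and relationships from the question would follow the same shape.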
