
Loading CSV in Neo4j is time consuming

I want to load a CDR CSV file containing 648,000 records into Neo4j (4.4.10), but the load has been running for about 4 days and still has not finished.

My CSV has 648,000 records with 7 columns. The file size is about 48 MB. My machine has 100 GB of RAM and an Intel Xeon E5 CPU.

The columns of the CSV are:

OP_Name TP_Name Called_Number OP_ANI Setup_Time Duration OP_Price

The code I use to load the CSV into Neo4j is:

```Cypher
:auto load csv with headers from 'file:///cdr.csv' as line FIELDTERMINATOR  ','
    with line
    where line['Called_Number'] is not null and line['OP_ANI'] is not null
    with line['OP_ANI'] as OP_Phone,
        (CASE line['OP_Name']
            WHEN 'TIC' THEN 'IRAN'
            ELSE 'Foreign' END) AS OP_country,
        line['Called_Number'] as Called_Phone,
        (CASE line['TP_Name']
            WHEN 'TIC' THEN 'IRAN'
            ELSE 'Foreign' END) AS TP_country,
        line['Setup_Time'] as Setup_Time, 
        line['Duration'] as Duration, 
        line['OP_Price'] as OP_Price
    
    call {
        with  OP_Phone, OP_country, Called_Phone, TP_country, Setup_Time, Duration, OP_Price
        
        MERGE (c:Customer{phone: toInteger(Called_Phone)})
            on create set c.country = TP_country
        WITH c, OP_Phone, OP_country, Called_Phone, TP_country, Setup_Time, Duration, OP_Price
        CALL apoc.create.addLabels( c, [ c.country ] ) YIELD node

        MERGE (c2:Customer{phone: toInteger(OP_Phone)})
            on create set c2.country = OP_country
        WITH c2, OP_Phone, OP_country, Called_Phone, TP_country, Setup_Time, Duration, OP_Price, c
        CALL apoc.create.addLabels( c2, [ c2.country ] )  YIELD node
        
        MERGE (c2)-[r:CALLED{setupTime: Setup_Time, 
                    duration: Duration,
                    OP_Price: OP_Price}]->(c)
       
    } IN TRANSACTIONS

```

How can I speed up the load operation?

MERGE acts as an upsert in Neo4j. So the statement:

MERGE (c:Customer{phone: toInteger(Called_Phone)})

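checks whether a Customer node with the given phone number exists. If it does, that node is matched and reused; otherwise the node is created. Without an index, this lookup has to scan the existing Customer nodes, so it gets slower as the number of nodes grows, and the CSV import as a whole becomes slow. Creating an index on the phone property of Customer should solve the problem. You can create the index like this: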

CREATE INDEX phone IF NOT EXISTS FOR (n:Customer) ON (n.phone)
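
Before re-running the import, it may be worth confirming that the index is online and that the MERGE lookup actually uses it. The following is only a rough sketch, assuming Neo4j 4.4 and that phone is stored as an integer, as in the question. The phone number literal is a made-up placeholder, and the statements are meant to be run one at a time (or in cypher-shell):

```Cypher
// Wait (up to 300 seconds) for any indexes that are still being populated.
CALL db.awaitIndexes(300);

// List the indexes; the one on :Customer(phone) should report state ONLINE.
SHOW INDEXES;

// Show the query plan without executing the statement.
// With the index in place, the lookup should appear as a NodeIndexSeek
// instead of a NodeByLabelScan followed by a filter.
// 989120000000 is a placeholder value, not taken from the original data.
EXPLAIN MERGE (c:Customer {phone: 989120000000}) RETURN c;
```

If the plan still shows a label scan, one thing to check is that the value type used in the MERGE (here an integer, via toInteger in the question's query) matches the type stored in the property.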

