Can I use this CSV to load a neo4j graph with cypher?

I am a medical doctor trying to model a drugs-to-enzymes database, and I am starting with a CSV file that I use to load my data into the Gephi graph layout program. I understand the power of a graph database but am not familiar with Cypher.

The current CSV has the following format:

source;target;arc_type; <- this is a header needed for Gephi import
artemisinin;2B6;induces;
...
amiodarone;1A2;represses;
...
3A457;carbamazepine;metabolizes;

These sample records show the three types of relationships. Drugs can repress or augment a cytochrome, and cytochromes metabolize drugs.

Is there a way to use this CSV as is to load into neo4j and create the graph?

Thank you very much.

In Neo4j terminology, a relationship must have exactly one type, and a node can have any number of labels. It looks like your use case could benefit from labelling your nodes as either Drug or Cytochrome.

Here is a possible neo4j data model for your use case:

(:Drug)-[:MODULATES {induces: false}]->(:Cytochrome)
(:Cytochrome)-[:METABOLIZES]->(:Drug)

The induces property is a boolean indicating whether a drug induces (true) or represses (false) the related cytochrome.
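
As a sketch of what this model makes possible (it is not part of the load itself), the query below looks for potential drug interactions: a drug that induces or represses a cytochrome which in turn metabolizes another drug. It assumes the nodes carry an id property, as in the load query shown further down; the aliases modulator, effect, cytochrome, and affected_drug are illustrative names only.

```cypher
// Find drug pairs linked through a shared cytochrome.
MATCH (d1:Drug)-[m:MODULATES]->(c:Cytochrome)-[:METABOLIZES]->(d2:Drug)
WHERE d1 <> d2
RETURN d1.id AS modulator,
       CASE m.induces WHEN true THEN 'induces' ELSE 'represses' END AS effect,
       c.id AS cytochrome,
       d2.id AS affected_drug
ORDER BY cytochrome, modulator;
```

Keeping induction and repression on one MODULATES relationship type lets a single pattern cover both cases, with the induces flag telling them apart in the result.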

The following is a (somewhat complex) query that generates the above data model from your CSV file:

USING PERIODIC COMMIT 500
LOAD CSV WITH HEADERS FROM 'file:///Drugs.csv' AS line FIELDTERMINATOR ';'
WITH line,
  CASE line.arc_type
    WHEN 'metabolizes' THEN {a: [1]}
    WHEN 'induces' THEN {b: [true]}
    ELSE {b: [false]}
  END AS todo
FOREACH (ignored IN todo.a |
  MERGE (c:Cytochrome {id: line.source})
  MERGE (d:Drug {id: line.target})
  MERGE (c)-[:METABOLIZES]->(d)
)
FOREACH (induces IN todo.b |
  MERGE (d:Drug {id: line.source})
  MERGE (c:Cytochrome {id: line.target})
  MERGE (d)-[:MODULATES {induces: induces}]->(c)
)

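The CASE expression builds a map with only one key per row (a for 'metabolizes', b otherwise), so for any given row either todo.a or todo.b is null. The FOREACH clause does nothing if the value after the IN is null, which means only the matching block runs for each row.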
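
Once the load has run, a quick sanity check is to count what was created. This is a minimal sketch that assumes the database contains only the data loaded by the query above; the aliases rel_type, induces, and total are illustrative.

```cypher
// Count relationships by type; induces is null for METABOLIZES rows.
MATCH ()-[r]->()
RETURN type(r) AS rel_type, r.induces AS induces, count(*) AS total
ORDER BY rel_type, induces;
```

The totals should add up to the number of data rows in the CSV, minus any duplicate rows, since MERGE does not create the same relationship twice.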

Yes, it's possible, but you will need to install APOC, a library of useful stored procedures for Neo4j. You can find it here: https://neo4j-contrib.github.io/neo4j-apoc-procedures/

Then put your CSV file into Neo4j's import folder and run the following queries.

The first one creates a unique constraint on :Node(name):

CREATE CONSTRAINT ON (n:Node) ASSERT n.name IS UNIQUE;
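
The ON ... ASSERT form above is the syntax of older Neo4j releases. On Neo4j 4.4 and later, the equivalent statement is, as far as I know, written with FOR ... REQUIRE. The constraint name node_name_unique is a hypothetical name chosen for illustration.

```cypher
// Same uniqueness constraint in the newer syntax (Neo4j 4.4+).
CREATE CONSTRAINT node_name_unique IF NOT EXISTS
FOR (n:Node) REQUIRE n.name IS UNIQUE;
```

Either form also creates an index on name, which is what keeps the MERGE lookups fast during the import.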

Then run this query to import your data:

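USING PERIODIC COMMIT 500
// The file is semicolon-delimited, so the field terminator must be set.
LOAD CSV WITH HEADERS FROM 'file:///my-csv-file.csv' AS line FIELDTERMINATOR ';'
  MERGE (n:Node {name: line.source})
  MERGE (m:Node {name: line.target})
  WITH n, m, line
  // The relationship type is taken from the arc_type column at run time.
  CALL apoc.create.relationship(n, line.arc_type, {}, m) YIELD rel
  RETURN count(rel)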
