I am a medical doctor trying to model a drug-to-enzyme database. I am starting with a CSV file that I use to load my data into the Gephi graph layout program. I understand the power of a graph database but am not yet literate in Cypher.
The current CSV has the following format:
source;target;arc_type; <- this is a header needed for Gephi import
artemisinin;2B6;induces;
...
amiodarone;1A2;represses;
...
3A457;carbamazepine;metabolizes;
These sample records show the three types of relationships. Drugs can repress or induce a cytochrome, and cytochromes metabolize drugs.
Is there a way to load this CSV as is into Neo4j and create the graph?
Thank you very much.
In neo4j terminology, a relationship must have a type, and a node can have any number of labels. It looks like your use case could benefit from labelling your nodes with either Drug or Cytochrome.
Here is a possible neo4j data model for your use case:
(:Drug)-[:MODULATES {induces: false}]->(:Cytochrome)
(:Cytochrome)-[:METABOLIZES]->(:Drug)
The induces property has a boolean value indicating whether a drug induces (true) or represses (false) the related cytochrome.
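As an illustration of what this model allows, a query along the following lines could list drugs that modulate a cytochrome which in turn metabolizes another drug. This is only a sketch: it assumes the labels and relationship types shown above and the id property used by the import query in this answer, and the column aliases are made up for readability.

```cypher
// Drugs that modulate a cytochrome which in turn metabolizes another drug
MATCH (a:Drug)-[m:MODULATES]->(c:Cytochrome)-[:METABOLIZES]->(b:Drug)
WHERE a <> b
RETURN a.id AS modulator,
       CASE WHEN m.induces THEN 'induces' ELSE 'represses' END AS effect,
       c.id AS cytochrome,
       b.id AS substrate
ORDER BY cytochrome, modulator;
```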
The following is a (somewhat complex) query that generates the above data model from your CSV file:
USING PERIODIC COMMIT 500
LOAD CSV WITH HEADERS FROM 'file:///Drugs.csv' AS line FIELDTERMINATOR ';'
WITH line,
  CASE line.arc_type
    WHEN 'metabolizes' THEN {a: [1]}
    WHEN 'induces' THEN {b: [true]}
    ELSE {b: [false]}
  END AS todo
FOREACH (ignored IN todo.a |
  MERGE (c:Cytochrome {id: line.source})
  MERGE (d:Drug {id: line.target})
  MERGE (c)-[:METABOLIZES]->(d)
)
FOREACH (induces IN todo.b |
  MERGE (d:Drug {id: line.source})
  MERGE (c:Cytochrome {id: line.target})
  MERGE (d)-[:MODULATES {induces: induces}]->(c)
)
The FOREACH clause does nothing if the value after the IN is null, so for each CSV line only one of the two FOREACH clauses performs its MERGE operations.
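Because the query above uses MERGE on the id property for every line, it may help to create uniqueness constraints before running the import, so that each lookup is backed by an index. The following is a sketch that assumes the Drug and Cytochrome labels and the id property from this answer, and uses the older constraint syntax that matches Neo4j versions supporting USING PERIODIC COMMIT (newer versions use a CREATE CONSTRAINT ... FOR ... REQUIRE form instead).

```cypher
// Run these once, before the import query
CREATE CONSTRAINT ON (d:Drug) ASSERT d.id IS UNIQUE;
CREATE CONSTRAINT ON (c:Cytochrome) ASSERT c.id IS UNIQUE;
```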
Yes, it's possible, but you will need to install APOC, a library of useful stored procedures for Neo4j. You can find it here: https://neo4j-contrib.github.io/neo4j-apoc-procedures/
Then put your CSV file into the import folder of Neo4j and run the following queries.
First, create a unique constraint on :Node(name):
CREATE CONSTRAINT ON (n:Node) ASSERT n.name IS UNIQUE;
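Then run this query to import your data:
USING PERIODIC COMMIT 500
// The sample CSV is semicolon-separated, so the field terminator is given explicitly
LOAD CSV WITH HEADERS FROM 'file:///my-csv-file.csv' AS line FIELDTERMINATOR ';'
MERGE (n:Node {name: line.source})
MERGE (m:Node {name: line.target})
WITH n, m, line
// The relationship type is taken from the arc_type column at run time
CALL apoc.create.relationship(n, line.arc_type, {}, m) YIELD rel
RETURN count(rel)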
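After the import, a quick sanity check could count the relationships per type. This is a sketch that assumes the Node label used above and that the relationship types are the raw arc_type values from the CSV (for example induces, represses, metabolizes).

```cypher
// Count imported relationships grouped by their type
MATCH (n:Node)-[r]->(m:Node)
RETURN type(r) AS arc_type, count(*) AS relationships
ORDER BY arc_type;
```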