Can I use this CSV to load a neo4j graph with cypher?

I am a medical doctor trying to model a drugs-to-enzymes database. I am starting with a CSV file that I currently use to load my data into the Gephi graph layout program. I understand the power of a graph database but am illiterate in Cypher.

The current CSV has the following format:

source;target;arc_type; <- this is a header needed for Gephi import
artemisinin;2B6;induces;
...
amiodarone;1A2;represses;
...
3A457;carbamazepine;metabolizes;

These sample records show the three types of relationships. Drugs can repress or augment a cytochrome, and cytochromes metabolize drugs.

Is there a way to load this CSV into neo4j as is and create the graph?

Thank you very much.

In neo4j terminology, a relationship must have a type, and a node can have any number of labels. It looks like your use case could benefit from labelling your nodes with either Drug or Cytochrome.

Here is a possible neo4j data model for your use case:

(:Drug)-[:MODULATES {induces: false}]->(:Cytochrome)
(:Cytochrome)-[:METABOLIZES]->(:Drug)

The induces property has a boolean value indicating whether a drug induces (true) or represses (false) the related cytochrome.
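Once data follows this model, interaction questions become short pattern matches. The query below is only a sketch: it assumes every node carries an `id` property holding the name from the CSV, and that the carbamazepine sample record has been loaded. The variable name `victim` is purely illustrative.

```cypher
// Drugs that repress a cytochrome which in turn metabolizes carbamazepine.
// Assumes each node has an `id` property taken from the CSV.
MATCH (d:Drug)-[:MODULATES {induces: false}]->(c:Cytochrome)
      -[:METABOLIZES]->(victim:Drug {id: 'carbamazepine'})
RETURN d.id AS repressor, c.id AS cytochrome
```

Changing `induces: false` to `induces: true` would list the inducers instead.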

The following is a (somewhat complex) query that generates the above data model from your CSV file:

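USING PERIODIC COMMIT 500
LOAD CSV WITH HEADERS FROM 'file:///Drugs.csv' AS line FIELDTERMINATOR ';'
// Build a map with exactly one key per row: `a` for metabolizes rows,
// `b` (true = induces, false = anything else) for all other rows.
WITH line,
  CASE line.arc_type
    WHEN 'metabolizes' THEN {a: [1]}
    WHEN 'induces' THEN {b: [true]}
    ELSE {b: [false]}
  END AS todo
// Metabolizes rows: the source is the cytochrome, the target is the drug.
FOREACH (ignored IN todo.a |
  MERGE (c:Cytochrome {id: line.source})
  MERGE (d:Drug {id: line.target})
  MERGE (c)-[:METABOLIZES]->(d)
)
// Induces/represses rows: the source is the drug, the target is the cytochrome.
FOREACH (induces IN todo.b |
  MERGE (d:Drug {id: line.source})
  MERGE (c:Cytochrome {id: line.target})
  MERGE (d)-[:MODULATES {induces: induces}]->(c)
)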

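The FOREACH clause does nothing if the value after the IN is null. Since looking up a key that is missing from the todo map yields null, only one of the two FOREACH clauses does any work for a given row.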
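The conditional FOREACH trick can be tried on its own, without any CSV file. In this minimal sketch the `Demo` label and `branch` property are made up for illustration and are not part of the data model:

```cypher
// The map has a key `a` but no key `b`, so todo.b evaluates to null.
WITH {a: [1]} AS todo
// Iterates over the one-element list, so the MERGE runs once.
FOREACH (ignored IN todo.a | MERGE (:Demo {branch: 'a'}))
// Iterates over null, so the MERGE is skipped.
FOREACH (ignored IN todo.b | MERGE (:Demo {branch: 'b'}))
```

Afterwards, `MATCH (n:Demo) RETURN n.branch` should show which branch ran, and `MATCH (n:Demo) DELETE n` cleans up the demo nodes.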

Yes, it's possible, but you will need to install APOC, a collection of useful stored procedures for Neo4j. You can find it here: https://neo4j-contrib.github.io/neo4j-apoc-procedures/

Then put your CSV file into the import folder of Neo4j and run the following queries.

The first one creates a unique constraint on :Node(name):

CREATE CONSTRAINT ON (n:Node) ASSERT n.name IS UNIQUE;

The second one imports your data:

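USING PERIODIC COMMIT 500
LOAD CSV WITH HEADERS FROM 'file:///my-csv-file.csv' AS line FIELDTERMINATOR ';'
  MERGE (n:Node {name: line.source})
  MERGE (m:Node {name: line.target})
  WITH n, m, line
  // Create a relationship whose type is the row's arc_type value
  CALL apoc.create.relationship(n, line.arc_type, {}, m) YIELD rel
  RETURN count(rel)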
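After the import, a quick sanity check is to count the relationships per type. This is only a sketch and assumes the nodes were created with the `Node` label as above; with the sample file, the types should be induces, represses and metabolizes.

```cypher
// Count imported relationships grouped by their type.
MATCH (:Node)-[r]->(:Node)
RETURN type(r) AS arc_type, count(*) AS relationships
ORDER BY relationships DESC
```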
