How do I load nodes and edges from different files in neo4j using cypher efficiently?
Say I have a csv file containing node information, each line with a unique id (the first column), and another csv file containing the edges, describing edges between the nodes (via their unique IDs). The following cypher code successfully loads the nodes and then creates the edges. However, can I make it more efficient? My real data set has millions of nodes and tens of millions of edges. Obviously I should use periodic commits and create an index, but can I somehow avoid matching for every single edge and use the fact that I know the unique node ids for each edge I want to build? Or am I going about this all wrong? I would like to do this entirely in cypher (no java).
load csv from 'file:///home/user/nodes.txt' as line
create (:foo { id: toInt(line[0]), name: line[1], someprop: line[2]});
load csv from 'file:///home/user/edges.txt' as line
match (n1:foo { id: toInt(line[0])} )
with n1, line
match (n2:foo { id: toInt(line[1])} )
// if I had an index I'd use it here with: using index n2:foo(id)
merge (n1) -[:bar]-> (n2) ;
match p = (n)-->(m) return p;
nodes.txt:
0,node0,Some Property 0
1,node1,Some Property 1
2,node2,Some Property 2
3,node3,Some Property 3
4,node4,Some Property 4
5,node5,Some Property 5
6,node6,Some Property 6
7,node7,Some Property 7
8,node8,Some Property 8
9,node9,Some Property 9
10,node10,Some Property 10
...
edges.txt:
0,2
0,4
0,8
0,13
1,4
1,8
1,15
2,4
2,6
3,4
3,7
3,8
3,11
4,10
...
Like Ron commented above, LOAD CSV is likely not the way to go for large datasets, and the csv Batch Import tool he links to is great. If you find you cannot easily wedge your csv into a form that works with the Batch Import tool, then the Neo4j BatchInserter API is very simple to use: http://docs.neo4j.org/chunked/stable/batchinsert.html
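For completeness, if the load does stay in Cypher, the periodic commits and index mentioned in the question could look roughly like the sketch below. It is not a tested recipe: it assumes the older Neo4j syntax that the question's code already uses (toInt, USING PERIODIC COMMIT), which later releases changed, and the batch size of 1000 is an arbitrary choice. It does not avoid the per-edge match, but with the constraint in place each match should become an index lookup on id rather than a scan over all foo nodes.

```cypher
// Assumption: older Neo4j syntax (toInt, USING PERIODIC COMMIT); adjust for newer releases.
// A uniqueness constraint on :foo(id) also creates the index used by the MATCH clauses below.
CREATE CONSTRAINT ON (n:foo) ASSERT n.id IS UNIQUE;

// Load the nodes, committing every 1000 rows (arbitrary batch size).
USING PERIODIC COMMIT 1000
LOAD CSV FROM 'file:///home/user/nodes.txt' AS line
CREATE (:foo { id: toInt(line[0]), name: line[1], someprop: line[2] });

// Load the edges; both endpoints are looked up by their unique id.
USING PERIODIC COMMIT 1000
LOAD CSV FROM 'file:///home/user/edges.txt' AS line
MATCH (n1:foo { id: toInt(line[0]) })
MATCH (n2:foo { id: toInt(line[1]) })
MERGE (n1)-[:bar]->(n2);
```

MERGE is kept from the question's code so that a repeated line in edges.txt does not create a duplicate relationship; if the file is known to contain no duplicates, CREATE would skip that existence check.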