
How to combine similar nodes in neo4j

I have defined a few nodes and relationships in a Neo4j graph database, but the output differs from what I expected: each node represents only its own data and attributes. I want identical nodes combined into a single node that shows all of their relationships and attributes.

`LOAD CSV WITH HEADERS FROM "file:///data.csv" AS line 
CREATE(s:SourceID{Name:line.SourceID})
CREATE(t:Title{Name:line.Title})
CREATE(c:Coverage{Name:line.Coverage})
CREATE(p:Publisher{Name:line.Publisher})
MERGE (p)-[:PUBLISHES]->(t) 
MERGE (p)-[:Coverage{covers:line.Coverage}]->(t)
MERGE (t)-[:BelongsTO]->(p)
MERGE (s)-[:SourceID]->(t)`


In the picture there are two nodes named Springer Nature. I want only one Springer Nature node, with all the data associated with both nodes attached to that single node.

First of all, I would recommend setting a CONSTRAINT before adding data. The duplicates appear because the nodes are created with CREATE, which adds a new node for every CSV row, and nothing in the Cypher query says that the nodes have to be unique; the MERGE clauses only cover the relationship patterns.

So in your case, first run this for each of the node labels:

CREATE CONSTRAINT publisherID IF NOT EXISTS FOR (n:Publisher) REQUIRE (n.Name) IS UNIQUE;
CREATE CONSTRAINT sourceID IF NOT EXISTS FOR (n:SourceID) REQUIRE (n.Name) IS UNIQUE;
CREATE CONSTRAINT titleID IF NOT EXISTS FOR (n:Title) REQUIRE (n.Name) IS UNIQUE;
CREATE CONSTRAINT coverageID IF NOT EXISTS FOR (n:Coverage) REQUIRE (n.Name) IS UNIQUE;
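
To confirm the constraints were actually created, they can be listed. This is a small sketch that assumes a Neo4j version recent enough to support the `REQUIRE` syntax used above. Note that creating a uniqueness constraint fails if duplicate values already exist for that label and property, which is one reason to start from a clean database.

```cypher
// List all constraints in the current database; the four names
// created above (publisherID, sourceID, titleID, coverageID)
// should appear in the result.
SHOW CONSTRAINTS;
```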

Even better would be to key on a publisher ID rather than the name. But that is your choice, and unless the data contains thousands of publishers it is not an issue.

Also, I would not use CREATE for the nodes in the first place, but MERGE. The query is executed once per CSV row, so if a row tries to create a node that already exists, which could happen on the second row or on the 50th, the query fails once the CONSTRAINT above is set.
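
The difference can be seen on a single node. The following is a minimal sketch using the `Publisher` label and the Springer Nature value from the question; the literal string stands in for `line.Publisher`.

```cypher
// Running this statement twice still leaves exactly one Publisher node:
// MERGE matches the existing node instead of creating a second one.
MERGE (p:Publisher {Name: 'Springer Nature'})
RETURN p.Name AS name;

// By contrast, every run of the CREATE below would add another node,
// or fail once the uniqueness constraint on Publisher.Name exists.
// CREATE (p:Publisher {Name: 'Springer Nature'});
```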

And try everything on a blank database, for example by deleting all nodes first:

MATCH (n) DETACH DELETE n
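
If wiping the database is not an option, the existing duplicates could be folded together instead. The sketch below assumes the APOC library is installed (it is not part of the original answer) and uses its `apoc.refactor.mergeNodes` procedure on the `Publisher` label; the same pattern would apply to the other labels. It would need to run before the uniqueness constraints are created, since those cannot be added while duplicates exist.

```cypher
// Group Publisher nodes by Name and merge every group that has
// more than one member into a single node.
MATCH (p:Publisher)
WITH p.Name AS name, collect(p) AS nodes
WHERE size(nodes) > 1
// properties: 'discard' keeps the first node's property values;
// mergeRels: true collapses relationships that become identical.
CALL apoc.refactor.mergeNodes(nodes, {properties: 'discard', mergeRels: true})
YIELD node
RETURN node.Name AS merged;
```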

To sum up, here is the full Cypher, which you can run in one go or send as separate queries:

CREATE CONSTRAINT publisherID IF NOT EXISTS FOR (n:Publisher) REQUIRE (n.Name) IS UNIQUE;
CREATE CONSTRAINT sourceID IF NOT EXISTS FOR (n:SourceID) REQUIRE (n.Name) IS UNIQUE;
CREATE CONSTRAINT titleID IF NOT EXISTS FOR (n:Title) REQUIRE (n.Name) IS UNIQUE;
CREATE CONSTRAINT coverageID IF NOT EXISTS FOR (n:Coverage) REQUIRE (n.Name) IS UNIQUE;

LOAD CSV WITH HEADERS FROM "file:///data.csv" AS line 
MERGE(s:SourceID{Name:line.SourceID})
MERGE(t:Title{Name:line.Title})
MERGE(c:Coverage{Name:line.Coverage})
MERGE(p:Publisher{Name:line.Publisher})
MERGE (p)-[:PUBLISHES]->(t) 
MERGE (p)-[:Coverage{covers:line.Coverage}]->(t)
MERGE (t)-[:BelongsTO]->(p)
MERGE (s)-[:SourceID]->(t)
RETURN count(p), count(t), count(c), count(s);
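
After the import, a quick check for leftover duplicates can confirm that there is now only one Springer Nature node. This is a sketch for the `Publisher` label; swap in `Title`, `SourceID` or `Coverage` to check the others.

```cypher
// List any Name value that is still carried by more than one
// Publisher node; an empty result means there are no duplicates.
MATCH (p:Publisher)
WITH p.Name AS name, count(*) AS copies
WHERE copies > 1
RETURN name, copies
ORDER BY copies DESC;
```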

Hope this helps. Please consider upvoting this answer or marking it as accepted if it helped. Thanks.
