Neo4j building initial graph is slow

Question

I am trying to build out a social graph between 100k users. Users can sync other social media platforms or upload their own contacts. Building each relationship takes about 200ms. Currently, I push everything onto a queue so it runs in the background, but ideally I would complete it within the HTTP request window. I've tried a few things and received a few warnings.

Added an index on the field pn.
Getting the warning "This query builds a cartesian product between disconnected patterns." I understand why I am getting this warning, but no relationship exists yet; creating it is exactly what this initial call does.

MATCH (p1:Person {userId: "....."}), (p2:Person) WHERE p2.pn = "....." MERGE (p1)-[:REL]->(p2) RETURN p1, p2

Any advice on how to make it faster? Ideally, each relationship creation is around 1-2ms.

Answer 1

You may want to EXPLAIN the query and make sure that NodeIndexSeek operators are being used, and not NodeByLabelScan. You also mentioned an index on :Person(pn), but you have a lookup on :Person(userId), so you might be missing an index there, unless that was a typo.
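If you would rather run this check from application code than read the plan by eye, a minimal sketch follows. It assumes the plan is available as a nested dict with `operatorType` and `children` keys, which is the shape I would expect from the official Python driver's `session.run("EXPLAIN ...").consume().plan`; verify that against your driver version. `find_operators` and `uses_label_scan` are hypothetical helper names, not part of any library.

```python
def find_operators(plan):
    """Return the operator names found in a query plan, depth-first."""
    # Assumed shape: {"operatorType": str, "children": [sub-plans, ...]}
    ops = [plan.get("operatorType", "")]
    for child in plan.get("children", []):
        ops.extend(find_operators(child))
    return ops


def uses_label_scan(plan):
    """Return True if any step of the plan is a NodeByLabelScan."""
    # Operator names may carry a suffix such as "@neo4j", so match on the prefix.
    return any(op.startswith("NodeByLabelScan") for op in find_operators(plan))
```

If `uses_label_scan` returns True for your MATCH, one side of the lookup is probably missing an index.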

Regarding the cartesian product warning, you can disregard it. The cartesian product is necessary in order to get the two nodes between which the relationship is created. This should be a 1 x 1 = 1 row operation, so it is only going to be costly if multiple nodes are matched per side, or if index lookups aren't being used.

If these are part of some batch load operation, then you may want to make your query apply in batches. So if 100 contacts are being loaded by a user, you do NOT want to execute 100 queries, each adding a single contact. Instead, pass the list of contacts as a parameter, then UNWIND the list and apply the query once to process the entire batch.

Something like:

UNWIND $batch as row
MATCH (p1:Person {pn: row.p1}), (p2:Person {pn: row.p2})
MERGE (p1)-[:REL]->(p2) 
RETURN p1, p2

It's usually okay to batch 10k or so entries at a time, though you can adjust that depending on the complexity of the query.
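On the client side, the batching could look roughly like the sketch below. It assumes a Python application; `run_query` is a hypothetical callable standing in for whatever executes Cypher in your stack (for example a thin wrapper around a driver session), and `chunked` and `load_relationships` are illustrative names. The RETURN clause is dropped here on the assumption that the caller does not need the matched nodes back.

```python
BATCH_QUERY = """
UNWIND $batch AS row
MATCH (p1:Person {pn: row.p1}), (p2:Person {pn: row.p2})
MERGE (p1)-[:REL]->(p2)
"""


def chunked(rows, size):
    """Yield consecutive slices of `rows`, each at most `size` items long."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def load_relationships(run_query, rows, batch_size=10_000):
    """Send `rows` in batches and return the number of queries issued.

    `run_query(query, params)` is a hypothetical hook that executes one
    parameterized Cypher statement.
    """
    calls = 0
    for batch in chunked(rows, batch_size):
        # One query per batch instead of one query per relationship.
        run_query(BATCH_QUERY, {"batch": batch})
        calls += 1
    return calls
```

Each element of `rows` is expected to be a dict with `p1` and `p2` keys, matching `row.p1` and `row.p2` in the query.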

Check out this blog entry for how to apply this approach.

https://dzone.com/articles/tips-for-fast-batch-updates-of-graph-structures-wi

Answer 2

You can make the planner use the index you created on Person by supplying a planner hint. Reference: https://neo4j.com/docs/cypher-manual/current/query-tuning/using/#query-using-index-hint

CREATE INDEX ON :Person(pn);

MATCH (p1:Person {userId: "....."}) 
WITH p1
MATCH (p2:Person) USING INDEX p2:Person(pn)
WHERE p2.pn = "....."
MERGE (p1)-[:REL]->(p2) 
RETURN p1, p2


solution1 1 ACCEPTED 2021-02-22 23:35:00

solution2 0 2021-02-22 23:41:49
