Currently I have a unique index on nodes with the label "ReferenceEntity" (on the entityId property). This query takes approximately 11 seconds to run and returns 7 rows. Granted, T1 has about 400,000 relationships.
I'm not sure why this would take so long, considering that a map of all nodes connected to T1 could be built, which would give constant-time lookups.
Am I missing some other index feature that Neo4j provides? Also, my entire dataset is in memory, so this shouldn't have anything to do with going to disk.
MATCH (n:ReferenceEntity {entityId: "T1"})-[r:HAS_REL]-(d:ReferenceEntity) WHERE d.entityId IN ["T2", "T3", "T4"] RETURN n
:schema
Indexes
ON :ReferenceEntity(entityId) ONLINE (for uniqueness constraint)
Constraints
ON (referenceentity:ReferenceEntity) ASSERT referenceentity.entityId IS UNIQUE
Explain Plan:
You used EXPLAIN instead of PROFILE to get that query plan, so it shows only estimated row counts, which are misleading here. If you had used PROFILE, the Expand(All) operation would have shown about 400,000 rows, since that operation actually iterates through every relationship. That is why your query takes so long.
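As a minimal sketch of that check, assuming the same data and schema as in the question, the original query can simply be prefixed with PROFILE so that the plan reports actual rows per operator instead of estimates:

```cypher
// PROFILE executes the query and records the real row count for each operator.
// EXPLAIN, by contrast, only plans the query and shows estimated rows.
PROFILE
MATCH (n:ReferenceEntity {entityId: "T1"})-[r:HAS_REL]-(d:ReferenceEntity)
WHERE d.entityId IN ["T2", "T3", "T4"]
RETURN n
```

The operator to look at in the resulting plan is Expand(All); its row count shows how many relationships were traversed before the filter on d was applied.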
You can try this query, which tells Cypher to use the index on d as well as on n. (On my machine, I had to use the USING INDEX clause twice to get the desired results.) It definitely pays to use PROFILE to tune Cypher code.
MATCH (n:ReferenceEntity { entityId : "T1" }) USING INDEX n:ReferenceEntity(entityId) MATCH (n)-[r:HAS_REL]-(d:ReferenceEntity) USING INDEX d:ReferenceEntity(entityId) WHERE d.entityId IN ["T2", "T3", "T4"] RETURN n, d;
Here is the Profile Plan (in my DB, I had 2 relationships that satisfy the WHERE test):