
How to use the WITH clause for Neo4j Cypher subquery formulation?

I am trying to create a simple Cypher query that finds all instances in the graph matching roughly this structure: (BlogPost A) -> (Term) <- (BlogPost B). That is, I am trying to find all pairs of blog posts that are flagged with the same term and, in addition, count the number of terms each pair shares. In this context, a term is a categorization mechanism.

Here is my query proposal:

MATCH (blogA:content {entitySubType:'blog'}) 
WITH blogA MATCH (blogA) -[]-> (t:term) <-[]- (blogB:content) 
WHERE blogB.entitySubType='blog' AND NOT (ID(blogA) = ID(blogB))  
RETURN ID(blogA), ID(blogB), count(t) ;

This query returns null after running for about a day.

Is it not possible to use blogA in the subquery the way I am using it? When I run the same query with limits, I do get results:

MATCH (blogA:content {entitySubType:'blog'}) 
WITH blogA 
LIMIT 10 
MATCH (blogA) -[]-> (t:term) <-[]- (blogB:content) 
WHERE blogB.entitySubType='blog' AND NOT (ID(blogA) = ID(blogB))  
RETURN ID(blogA), ID(blogB), count(t) 
LIMIT 20;

My Neo4j instance has ~500 GB of RAM, and the whole graph including all properties is ~30 GB with ~15 million vertices in total, of which 101k are blog vertices and 108k are terms.

I would be grateful for every hint about possible problems or suggestions for improvements.

Also make sure to consume that query with a client driver (e.g. Java) that can stream the billions of results. Here is a query that can use the compiled runtime, which should be the fastest and most memory-efficient option.

MATCH (blogA:Blog)-[:TAGGED]->(t:Term)<-[:TAGGED]-(blogB:Blog)
WHERE blogA <> blogB 
RETURN ID(blogA), ID(blogB), count(t);
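
Two follow-up queries may help here; both are sketches that reuse the labels and relationship type from the query above (:Blog, :Term, :TAGGED), which are assumptions and do not match the :content / :term model in the question as posted. The first estimates how many (blogA, t, blogB) paths the pair query has to expand before aggregating, assuming at most one TAGGED relationship per blog and term. The second returns each unordered pair only once by comparing node ids, which roughly halves the rows produced. The aliases (n, orderedPathCount, sharedTerms) are illustrative names.

```cypher
// Estimate the work: a term with n tagged blogs contributes n * (n - 1)
// ordered (blogA, blogB) paths. Assumes one TAGGED relationship per blog/term.
MATCH (t:Term)<-[:TAGGED]-(b:Blog)
WITH t, count(b) AS n
RETURN sum(n * (n - 1)) AS orderedPathCount;

// Return each unordered pair once: id(blogA) < id(blogB) excludes
// blogA = blogB and drops the mirrored (blogB, blogA) row.
MATCH (blogA:Blog)-[:TAGGED]->(t:Term)<-[:TAGGED]-(blogB:Blog)
WHERE id(blogA) < id(blogB)
RETURN id(blogA), id(blogB), count(t) AS sharedTerms;
```

If the first query reports a count in the billions, the long runtime is more likely due to the size of the result than to the way blogA is passed through WITH.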


 