
Optimizing Cypher Query - Neo4j

I have the following query:

MATCH (User1)-[:VIEWED]->(page)<-[:VIEWED]-(User2)

RETURN User1.userId, User2.userId, count(page) AS cnt

It's a relatively simple query to find co-page view counts between users. It's just too slow, and I have to terminate it after some time.

Details

There are about 150k User nodes and about 180k Page nodes.

There are about 380k User -VIEWS-> Page relationships.

User has 7 attributes, and Page has about 5 attributes.

User and Page are indexed on UserId and PageId respectively.

Heap size is 512 MB (I also tried running with 1 GB).

What are some ways to optimize this query? I don't think the number of nodes and relationships is that large.

Use Labels

Always use node labels in your patterns.

MATCH (u1:User)-[:VIEWED]->(p:Page)<-[:VIEWED]-(u2:User)
RETURN u1.userId, u2.userId, count(p) AS cnt;

Don't match on duplicate pairs of users

As written, the query matches every pair of users that share a viewed page twice: each user is bound to User1, and each user is also bound to User2. To limit this:

MATCH (u1:User)-[:VIEWED]->(p:Page)<-[:VIEWED]-(u2:User)
WHERE id(u1) > id(u2)
RETURN u1.userId, u2.userId, count(p) AS cnt;
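
To see the effect on a small scale, here is a minimal sketch on made-up toy data (the two users and the single page are hypothetical and not part of the original data set). It only illustrates how the symmetric pattern produces mirrored rows and how the ordering constraint keeps one of them:

```cypher
// Hypothetical toy data: two users who both viewed the same page.
CREATE (a:User {userId: 1}), (b:User {userId: 2}), (pg:Page {pageId: 10}),
       (a)-[:VIEWED]->(pg), (b)-[:VIEWED]->(pg);

// Without a constraint the pair shows up twice, once in each order:
// one row for (1, 2) and one row for (2, 1), each with cnt = 1.
MATCH (u1:User)-[:VIEWED]->(p:Page)<-[:VIEWED]-(u2:User)
RETURN u1.userId, u2.userId, count(p) AS cnt;

// With the ordering constraint only one of the two mirrored rows remains.
// Which one survives depends on the internal node ids.
MATCH (u1:User)-[:VIEWED]->(p:Page)<-[:VIEWED]-(u2:User)
WHERE id(u1) > id(u2)
RETURN u1.userId, u2.userId, count(p) AS cnt;
```

On the full data set this roughly halves the number of rows that have to be aggregated and returned.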

Query for a specific user

If you can bind either side of the pattern, the query will be much faster. Do you need to execute this query for all pairs of users? Would it make sense to execute it relative to a single user only? For example:

MATCH (u1:User {name: "Bob"})-[:VIEWED]->(p:Page)<-[:VIEWED]-(u2:User)
WHERE NOT u1=u2
RETURN u1.userId, u2.userId, count(p) AS cnt;
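
Since the question says User is indexed on its user id, a variant that anchors on that property may be a better fit. The following is only a sketch under assumptions: the property is assumed to be named userId as in the RETURN clause, the parameter name userId and the LIMIT of 10 are made up for illustration, and the $param syntax assumes a reasonably recent Neo4j version (older versions used {param}):

```cypher
// Assumes the client supplies a parameter named userId.
// Anchoring on the indexed property lets the planner start from one node
// instead of scanning every User.
MATCH (u1:User {userId: $userId})-[:VIEWED]->(p:Page)<-[:VIEWED]-(u2:User)
WHERE u1 <> u2
RETURN u2.userId AS otherUser, count(p) AS cnt
ORDER BY cnt DESC   // most shared pages first
LIMIT 10;           // hypothetical cut-off, adjust as needed
```

Using a parameter instead of a literal value also lets Neo4j reuse the cached query plan across different users.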

As you try different queries, you can prepend EXPLAIN or PROFILE to the Cypher query to see the execution plan. EXPLAIN shows the plan without running the query, while PROFILE runs it and also reports the number of rows and database hits for each operator.
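
As a rough illustration, this is how the two keywords would be applied to the labelled query from above. The exact plan output depends on your Neo4j version and data, so none is shown here:

```cypher
// EXPLAIN: show the planned operators only, the query itself is not run.
EXPLAIN
MATCH (u1:User)-[:VIEWED]->(p:Page)<-[:VIEWED]-(u2:User)
WHERE id(u1) > id(u2)
RETURN u1.userId, u2.userId, count(p) AS cnt;

// PROFILE: run the query and report rows and db hits per operator.
// Note that this executes the full query, so it can take just as long.
PROFILE
MATCH (u1:User)-[:VIEWED]->(p:Page)<-[:VIEWED]-(u2:User)
WHERE id(u1) > id(u2)
RETURN u1.userId, u2.userId, count(p) AS cnt;
```

In the plan, a NodeByLabelScan or NodeIndexSeek as the starting operator is generally what you want to see. An AllNodesScan usually means the pattern is missing a label.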
