I've recently started sketching out a personal project that will include a social network component. I have some professional experience with Neo4j, and while it feels like a perfect match, there is one query that concerns me.
Imagine a general social network: users follow each other, users write posts, and users can see the posts written by the users they follow. This is cleanly expressed in Neo4j through :User and :Post labelled nodes, connected through :posted and :follows relationships.
So I could get the posts by users I follow using a query like:
MATCH (:User {user_id: 1})-[:follows]->(:User)-[:posted]->(p:Post)
RETURN p
This is pretty clean and simple. My concern is that realistically I will want to get the most recent 10 posts, and then the 10 posts after that, and so on.
So I created an index on the created_at field of :Post nodes and added an ORDER BY p.created_at DESC clause to the query. I thought this would let me sort the posts efficiently. However, running EXPLAIN on this query shows that ORDER BY clauses do not, for the most part, use indexes to speed up sorting. As such, I'm unsure whether there's a way to get these posts efficiently once the result set becomes large.
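For reference, the setup described above could look roughly like the sketch below. It assumes the Neo4j 3.x index syntax (newer versions use a different CREATE INDEX form), and the two statements are meant to be run separately.

```cypher
// Single-property index on :Post(created_at), using Neo4j 3.x syntax.
CREATE INDEX ON :Post(created_at);

// EXPLAIN returns the execution plan without actually running the query.
EXPLAIN
MATCH (:User {user_id: 1})-[:follows]->(:User)-[:posted]->(p:Post)
RETURN p
ORDER BY p.created_at DESC;
```

If the resulting plan contains a Sort operator (or Top, once a LIMIT is added), the rows are being collected first and ordered afterwards rather than read from the index in order.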
This may be inexperience or just approaching this data model incorrectly. Can I get some input on this kind of problem? Should I model my data differently? Is my query/index wrong? Is there something I'm missing? How would you do this?
EDIT 1: Example query for something like what I meant:
MATCH (:User {user_id: 1})-[:follows]->(:User)-[:posted]->(p:Post)
RETURN p
ORDER BY p.created_at DESC
LIMIT 10
I've also been thinking that a range predicate (in a WHERE clause) could limit the size of the result set, but I'm still unsure whether there's a better way.
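A rough sketch of what that range-based paging might look like, assuming the client remembers the created_at value of the last post it displayed and sends it back as a parameter. The parameter name $before is hypothetical and not part of the original query.

```cypher
// Fetch the next page: posts strictly older than the last one already shown.
// $before is a hypothetical parameter holding that post's created_at value.
MATCH (:User {user_id: 1})-[:follows]->(:User)-[:posted]->(p:Post)
WHERE p.created_at < $before
RETURN p
ORDER BY p.created_at DESC
LIMIT 10
```

For the first page, $before could simply be the current time. One caveat with this approach: if two posts share exactly the same created_at value, a strict < comparison can skip one of them at a page boundary, so a tie-breaker such as a post id may be needed.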
EDIT 2 (Solution): This was the final query that made the Cypher planner use the index for this problem:
MATCH (:User {user_id: 1})-[:follows]->(:User)-[:posted]->(p:Post)
USING INDEX p:Post(created_at)
WHERE p.created_at < datetime()
RETURN p
ORDER BY p.created_at DESC
LIMIT 10
Neo4j 3.5 introduced some support for using indexes to perform ORDER BY operations, with some restrictions.
However, as of Neo4j 3.5.3, even when using an index for ORDER BY is supported, the Cypher planner does not seem to use it automatically for that purpose. In my experimentation with version 3.5.3, I found that if the indexed property is not also used in a WHERE clause, the planner will not use the index at all.
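So, as a simple workaround, you can add a trivial WHERE clause on the indexed property. For example, here is a modified version of your query that will "trick" the planner into using the index for ORDER BY (the comparison with 0 assumes created_at is stored as a number, such as an epoch timestamp; for a temporal value, compare against a temporal value instead):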
MATCH (:User {user_id: 1})-[:follows]->(:User)-[:posted]->(p:Post)
WHERE p.created_at > 0
RETURN p
ORDER BY p.created_at DESC
LIMIT 10