
Query and Sort in MongoDB for a many-to-many relationship

Suppose I have a relationship between users, posts, and likes. A user can like many posts, and a post can be liked by many users.

My goal is to design a db structure in MongoDB such that I can quickly query for all the posts a user has liked AND sort/filter them in the multiple ways listed below (not at the same time; think of a dropdown that lets you change the sort order of your search results):

  1. Order in which posts were liked
  2. Filter and order by various post attributes - such as title, number of post responses, when the post was created, etc

Suppose the number of posts is on the order of 100,000 and each post will have on the order of 100-1000 likes.

Possible solutions I've thought of:

1) likes are embedded in posts.

This allows #2 to be dealt with easily because you just have an index over likes.user_id and over whatever other post attributes you need. This is also fast, because you only need to run one query.

However, this makes it impossible to sort by when a user liked something (AFAIK).
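As a rough illustration of option 1, here is a minimal sketch in plain JavaScript that models the posts collection as an in-memory array. The field names (`likes.user_id`, `title`, `responses`) and the sample data are assumptions made for the example, and the mongosh commands in the comments are only an approximate equivalent.

```javascript
// In-memory stand-in for a `posts` collection with embedded likes.
// Field names and sample values are illustrative assumptions.
const posts = [
  { _id: 1, title: "Brie basics", responses: 12, likes: [{ user_id: 1234 }, { user_id: 9 }] },
  { _id: 2, title: "Aging cheddar", responses: 40, likes: [{ user_id: 9 }] },
  { _id: 3, title: "Zesty feta", responses: 3, likes: [{ user_id: 1234 }] },
];

// Rough mongosh equivalent under the assumed schema:
//   db.posts.createIndex({ "likes.user_id": 1 })
//   db.posts.find({ "likes.user_id": 1234 }).sort({ responses: -1 })
function likedPostsSortedBy(posts, userId, field, direction = 1) {
  return posts
    // Keep posts where at least one embedded like belongs to the user.
    .filter((p) => p.likes.some((l) => l.user_id === userId))
    // Sort by a post attribute; direction is 1 (ascending) or -1 (descending).
    .sort((a, b) => {
      if (a[field] < b[field]) return -direction;
      if (a[field] > b[field]) return direction;
      return 0;
    });
}

const byResponses = likedPostsSortedBy(posts, 1234, "responses", -1);
console.log(byResponses.map((p) => p.title));
```

A single query answers the "filter and sort by post attributes" requirement here, which is the appeal of embedding.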

2) likes are a separate collection with attributes post_id, account_id.

This allows #1 to be dealt with easily since you can just sort by _id. However, unless you duplicate & cache post attributes into the like document, it becomes impossible to handle #2. This is possible but really not ideal. Additionally, this is slower to query. You'd need to run two queries - one to query the like collection, then a post query using $in: [post_ids].
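The two-query flow of option 2 could look roughly like the sketch below, written in plain JavaScript over in-memory arrays rather than a live database. The numeric `_id` values stand in for ObjectIds, and the helper name `postsInLikeOrder` is hypothetical. One detail worth showing: `$in` does not return documents in the order of the array passed to it, so the like order has to be restored on the client.

```javascript
// In-memory stand-ins for the `likes` and `posts` collections.
// Numeric _id values stand in for ObjectIds (which also sort by creation time).
const likes = [
  { _id: 101, account_id: 1234, post_id: 1 },
  { _id: 102, account_id: 9, post_id: 1 },
  { _id: 103, account_id: 1234, post_id: 3 },
];
const posts = [
  { _id: 1, title: "Brie basics" },
  { _id: 2, title: "Aging cheddar" },
  { _id: 3, title: "Zesty feta" },
];

function postsInLikeOrder(likes, posts, accountId) {
  // Query 1, roughly: db.likes.find({ account_id: accountId }).sort({ _id: -1 })
  const postIds = likes
    .filter((l) => l.account_id === accountId)
    .sort((a, b) => b._id - a._id)
    .map((l) => l.post_id);

  // Query 2, roughly: db.posts.find({ _id: { $in: postIds } })
  // The result order is not tied to the order of postIds, so index by _id...
  const byId = new Map(
    posts.filter((p) => postIds.includes(p._id)).map((p) => [p._id, p])
  );

  // ...and rebuild the list in like order, skipping posts that no longer exist.
  return postIds.map((id) => byId.get(id)).filter((p) => p !== undefined);
}

console.log(postsInLikeOrder(likes, posts, 1234).map((p) => p.title));
```

Sorting by a post attribute instead would mean sorting the second query, which is where this layout gets awkward once results need paging.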

Are there any other solutions/designs I should consider? Am I missing anything in these proposed solutions?

I would use a denormalized version of #2. Have a like document:

{
    "_id" : ObjectId(...),
    "account_id" : 1234,
    "post_id" : 4321,
    "ts" : ISODate(...),
    // additional info about post needed for basic display
    "post_title" : "The 10 Worst-Kept Secrets of Cheesemongers"
    // etc.
}

With an index on { "account_id" : 1, "ts" : 1 }, you can efficiently find like documents for a specific user ordered by like time.

db.likes.find({ "account_id" : 1234 }).sort({ "ts" : -1 })

If you put the basic info about the post into the like document, you don't need to retrieve the post document until, say, the user clicks on a link to be shown the entire post.

The tradeoff is that, if some like-embedded information about a post changes, it needs to be changed in every like for that post. This could be nothing or it could be cumbersome, depending on what you choose to embed and how often posts are modified after they have a lot of likes.
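To make that tradeoff concrete, here is a small sketch in plain JavaScript of propagating a changed post title into every like document that copied it. The array stands in for the likes collection, the helper `propagateTitle` is hypothetical, and the mongosh command in the comment is an approximate equivalent that assumes the field names from the document above.

```javascript
// In-memory stand-in for the denormalized `likes` collection.
const likes = [
  { _id: 1, account_id: 1234, post_id: 4321, post_title: "Old title" },
  { _id: 2, account_id: 77, post_id: 4321, post_title: "Old title" },
  { _id: 3, account_id: 1234, post_id: 5000, post_title: "Another post" },
];

// Rough mongosh equivalent:
//   db.likes.updateMany({ post_id: 4321 }, { $set: { post_title: newTitle } })
// An index on post_id would likely be needed to keep this update cheap.
function propagateTitle(likes, postId, newTitle) {
  let modified = 0;
  for (const like of likes) {
    // Touch only likes of this post whose cached title is stale.
    if (like.post_id === postId && like.post_title !== newTitle) {
      like.post_title = newTitle;
      modified += 1;
    }
  }
  return modified; // number of like documents rewritten
}

console.log(propagateTitle(likes, 4321, "New title"));
```

With 100-1000 likes per post, each title edit rewrites that many like documents, so the cost depends mostly on how often popular posts are edited.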

Your first option seems quite good to me. It deals with both of your requirements nicely:

  1. You need to sort the posts and comments by their attributes, which is possible through aggregations.
  2. You need to filter the documents (posts) by some attributes, which is also possible.
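One way the aggregation route might work for like-time ordering is sketched below in plain JavaScript over an in-memory array. It assumes each embedded like also stores a timestamp `ts` (shown as a plain number here), which the question does not state. The pipeline in the comment is an approximate mongosh equivalent, and a sort placed after `$unwind` is unlikely to be served by an index.

```javascript
// In-memory stand-in for `posts` with embedded likes.
// Assumption: each like carries a `ts`; numbers stand in for dates.
const posts = [
  { _id: 1, title: "Brie basics", likes: [{ user_id: 1234, ts: 30 }, { user_id: 9, ts: 10 }] },
  { _id: 2, title: "Aging cheddar", likes: [{ user_id: 9, ts: 20 }] },
  { _id: 3, title: "Zesty feta", likes: [{ user_id: 1234, ts: 50 }] },
];

// Rough mongosh equivalent under the assumed schema:
//   db.posts.aggregate([
//     { $match: { "likes.user_id": 1234 } },
//     { $unwind: "$likes" },
//     { $match: { "likes.user_id": 1234 } },
//     { $sort: { "likes.ts": -1 } }
//   ])
function likedPostsByLikeTime(posts, userId) {
  return posts
    // $unwind: one row per (post, like) pair.
    .flatMap((p) => p.likes.map((like) => ({ _id: p._id, title: p.title, likes: like })))
    // Keep only the rows for this user's likes.
    .filter((row) => row.likes.user_id === userId)
    // Most recent like first.
    .sort((a, b) => b.likes.ts - a.likes.ts);
}

console.log(likedPostsByLikeTime(posts, 1234).map((row) => row.title));
```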

The disadvantage of two collections is that you need to run two queries to get one piece of data. NoSQL databases give you the flexibility to store related data in one place, and they perform best when you do. If you do not use that benefit, you won't get optimal performance.

Do not think from an RDBMS perspective (forget normalization). If you need more performance from the first option, use indexing and sharding (with a shard key based on, for example, alphabetical ranges or geography).
