
MongoDB - Many-to-many relationship?

I'm curious how one would structure a MongoDB where you have many-to-many relationships, with potentially tens of thousands of records.

Let's say you have a restaurant database that tracks a huge array of restaurants and all the people who have checked into those restaurants. So the user may want to look up a person and see all the restaurants they've checked into, but also look up a restaurant and see all the people who have checked in.

How does one structure this in a way that makes sense and is easy to search and update?

The example you give, in common with most real-world examples of many-to-many relationships, is actually an example of a few-to-few relationship. You may have many restaurants and many diners but, compared to the entire set, any given restaurant has only served a small subset of diners and most individual diners will only have visited a small subset of the restaurants. It sounds like a sparsely linked network where the link density ratio is significantly below one.

To measure the link density (edge density) of a network, we calculate the ratio of existing links m to the total number of possible links. For a network of N nodes, the network link density is D = m / (0.5 * N * (N - 1)). The (maximal) link density D of a completely connected network is 1. - Network-Science
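As a rough illustration of the quoted formula, here is a minimal sketch in Python. The function name `link_density` and the node and link counts are made up for the example; they are not taken from the question.

```python
def link_density(num_links: int, num_nodes: int) -> float:
    """Return D = m / (0.5 * N * (N - 1)) for an undirected network."""
    if num_nodes < 2:
        raise ValueError("link density needs at least two nodes")
    possible_links = 0.5 * num_nodes * (num_nodes - 1)
    return num_links / possible_links


# Hypothetical sparse network: 1,000 nodes joined by 5,000 links.
sparse = link_density(5_000, 1_000)

# A completely connected network of 5 nodes has 10 links.
full = link_density(10, 5)

print(sparse, full)
```

The sparse case comes out at roughly one percent of the possible links, which is the kind of "few-to-few" network described above, while the completely connected case gives exactly 1.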

However, you asked about many-to-many so how about we use a neural network as the example? Neural networks often form dense networks and so represent a true many-to-many network. In which case the answer is easy - don't use mongoDB. Use custom structures and serialisation strategies tailored to your specific requirements. After all, true many-to-many relationships are nearly always outliers and so justify specific treatment.

With that said, modelling the more usual few-to-few relationship in mongoDB can be achieved without sacrificing the rich document structure, and how you achieve this depends on your access patterns.

So, with the restaurant / diner network example, if you are typically going to query a restaurant on its diners then you would create an array of diner_ids held with each restaurant. The other way would mean an array of restaurant_ids held with each diner. Do both for two-way query ability.
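To make the two-way reference layout concrete, here is a small sketch that uses plain Python dicts as stand-ins for the two collections. The field names `diner_ids` and `restaurant_ids` follow the answer; the sample documents and the helper names (`check_in`, `restaurants_for`, `diners_for`) are assumptions made for illustration, not MongoDB API calls.

```python
# In-memory stand-ins for two collections, keyed by _id (sample data is made up).
restaurants = {
    "r1": {"_id": "r1", "name": "Noodle Bar", "diner_ids": []},
    "r2": {"_id": "r2", "name": "Taco Stand", "diner_ids": []},
}
diners = {
    "d1": {"_id": "d1", "name": "Ada", "restaurant_ids": []},
    "d2": {"_id": "d2", "name": "Grace", "restaurant_ids": []},
}


def check_in(diner_id, restaurant_id):
    """Record a visit on both documents, skipping ids already present."""
    restaurant = restaurants[restaurant_id]
    diner = diners[diner_id]
    if diner_id not in restaurant["diner_ids"]:
        restaurant["diner_ids"].append(diner_id)
    if restaurant_id not in diner["restaurant_ids"]:
        diner["restaurant_ids"].append(restaurant_id)


def restaurants_for(diner_id):
    """Names of the restaurants a diner has checked into."""
    return [restaurants[r]["name"] for r in diners[diner_id]["restaurant_ids"]]


def diners_for(restaurant_id):
    """Names of the diners who have checked into a restaurant."""
    return [diners[d]["name"] for d in restaurants[restaurant_id]["diner_ids"]]


check_in("d1", "r1")
check_in("d1", "r2")
check_in("d2", "r1")
check_in("d1", "r1")  # repeat visit, no duplicate ids stored

print(restaurants_for("d1"))
print(diners_for("r1"))
```

Against a real MongoDB deployment the duplicate check would typically be an `$addToSet` update on each document. Note that these are two separate writes, so the application has to make sure both succeed.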

Care has to be taken because there is no foreign_key constraint in mongoDB and therefore maintaining your data's referential integrity is your responsibility.
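One way to keep an eye on referential integrity is a periodic check for references that point at nothing. The sketch below again uses plain dicts in place of collections; the sample data and the function name `dangling_diner_ids` are hypothetical.

```python
# Stand-in collections; "d9" is referenced but has no diner document.
restaurants = {
    "r1": {"_id": "r1", "diner_ids": ["d1", "d2", "d9"]},
    "r2": {"_id": "r2", "diner_ids": ["d2"]},
}
diners = {
    "d1": {"_id": "d1", "name": "Ada"},
    "d2": {"_id": "d2", "name": "Grace"},
}


def dangling_diner_ids(restaurants, diners):
    """Map each restaurant _id to the diner ids it references that do not exist."""
    problems = {}
    for restaurant_id, restaurant in restaurants.items():
        missing = [d for d in restaurant["diner_ids"] if d not in diners]
        if missing:
            problems[restaurant_id] = missing
    return problems


dangling = dangling_diner_ids(restaurants, diners)
print(dangling)
```

The same idea applies in the other direction (restaurant ids held on diners), and the application still needs to decide what to do with the dangling ids it finds.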

If performance is most important to you then you may wish to embed the data in each document rather than reference it with an id. This is the higher performance option for reading (not so much for writing) as all the data can be pulled off the disk in one hit. It means that you will need to do more work when you update data values to ensure the integrity of your data, but often this is not as scary as it first seems. How often do the diners really change their names? And depending on the document sizes, you may not necessarily want to embed the full document; a subset of the data plus an id to point to the full record will often do the trick.
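The "subset plus an id" idea can be sketched as follows, again with dicts standing in for collections. The embedded field names (`diner_id`, `name`), the sample data, and the helper `rename_diner` are assumptions for illustration only.

```python
# Full diner records live in their own collection.
diners = {
    "d1": {"_id": "d1", "name": "Ada", "email": "ada@example.com"},
}

# Each restaurant embeds only a small subset of the diner, plus its id.
restaurants = {
    "r1": {"_id": "r1", "name": "Noodle Bar",
           "diners": [{"diner_id": "d1", "name": "Ada"}]},
    "r2": {"_id": "r2", "name": "Taco Stand",
           "diners": [{"diner_id": "d1", "name": "Ada"}]},
}


def rename_diner(diner_id, new_name):
    """Update the full record and every embedded copy; return copies touched."""
    diners[diner_id]["name"] = new_name
    updated = 0
    for restaurant in restaurants.values():
        for entry in restaurant["diners"]:
            if entry["diner_id"] == diner_id:
                entry["name"] = new_name
                updated += 1
    return updated


# Reading a restaurant needs no second lookup to show diner names.
names_before = [e["name"] for e in restaurants["r1"]["diners"]]

# The rare write has to fan out to every embedded copy.
copies_updated = rename_diner("d1", "Ada L.")
print(names_before, copies_updated)
```

This shows the trade-off described above: the read touches one document, while a name change has to visit every document that carries a copy.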

In short, mongoDB schema design should be driven by the application requirements. Different schemas for different applications as opposed to one monolithic relational DB to rule them all. What is the reality of the data? How does the application actually use this data? How big are the document objects being stored? Answer these questions and your schema will practically design itself.

I would create a checkins or visits collection. When a user visits a restaurant, a new document is created which references both the user and the restaurant. This is fairly clean and straightforward.
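A minimal sketch of that checkins collection, with a Python list standing in for the collection. The field names (`user_id`, `restaurant_id`, `at`) and the helper functions are assumptions chosen for the example.

```python
from datetime import datetime, timezone

# Stand-in for a "checkins" collection: one document per visit.
checkins = []


def record_checkin(user_id, restaurant_id, at=None):
    """Append a check-in document that references both sides of the visit."""
    doc = {
        "user_id": user_id,
        "restaurant_id": restaurant_id,
        "at": at or datetime.now(timezone.utc),
    }
    checkins.append(doc)
    return doc


def restaurants_visited_by(user_id):
    """Distinct restaurant ids a user has checked into, sorted."""
    return sorted({c["restaurant_id"] for c in checkins if c["user_id"] == user_id})


def users_who_visited(restaurant_id):
    """Distinct user ids that have checked into a restaurant, sorted."""
    return sorted({c["user_id"] for c in checkins if c["restaurant_id"] == restaurant_id})


record_checkin("u1", "r1")
record_checkin("u1", "r2")
record_checkin("u2", "r1")
record_checkin("u1", "r1")  # repeat visits are simply extra documents

print(restaurants_visited_by("u1"))
print(users_who_visited("r1"))
```

In MongoDB itself, both lookups would be queries on the checkins collection, and an index on each of the two reference fields would likely be needed to keep them fast as the collection grows.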
