简体繁体 English

MongoDB - 多对多关系？

[英]MongoDB - Many-to-many relationship?

原文 2015-02-01 20:13:22 1 2 mongodb

I'm curious how one would structure a MongoDB where you have many-to-many relationships, with potentially tens of thousands of records. 我很好奇如何构建一个拥有多对多关系的MongoDB，可能有数万条记录。

Let's say you have a restaurant database that tracks a huge array of restaurants and all the people who have checked into those restaurants. 假设您有一个餐馆数据库，可以跟踪大量的餐馆和所有已经入住这些餐馆的人。 So the user may want to look up a person and see all the restaurants they've checked into, but also look up a restaurant and see all the people who have checked in. 因此，用户可能想要查找一个人并查看他们检查过的所有餐馆，还可以查找餐馆并查看已登记的所有人。

How does one structure this in a way that makes sense and is easy to search and update? 如何以一种有意义且易于搜索和更新的方式构建它？

2 个解决方案

The example you give, in common with most real-world examples of many-to-many relationships, is actually an example of a few-to-few relationship. 您提供的示例与大多数真实世界的多对多关系示例一样，实际上是少数关系的一个示例。 You may have many restaurants and many diners but, compared to the entire set, any given restaurant has only served a small subset of diners and most individual diners will only have visited a small subset of the restaurants. 您可能有许多餐馆和许多用餐者，但与整套餐厅相比，任何一家餐厅只供应一小部分食客，大多数个人用餐者只会访问一小部分餐馆。 It sounds like a sparsely linked network where the link density ratio is significantly below one. 这听起来像一个稀疏链接的网络，其链路密度比明显低于1。

To measure the link density (edge density) of a network, we calculate the ratio of existing links m to the total number of possible links. 为了测量网络的链路密度（边缘密度），我们计算现有链路m与可能链路总数的比率。 For a network of N nodes, the network link density is D = m / 0.5*N*(N-1) The (maximal) link density D of a completely connected network is 1. - Network-Science 对于N个节点的网络，网络链路密度为D = m / 0.5 * N *（N-1）完全连接网络的（最大）链路密度D为1. - 网络 - 科学

However, you asked about many-to-many so how about we use a neural network as the example? 但是，您询问了多对多，那么我们如何以神经网络为例呢？ Neural networks often form dense networks and so represent a true many-to-many network. 神经网络通常形成密集网络，因此代表真正的多对多网络。 In which case the answer is easy - don't use mongoDB. 在这种情况下答案很简单 - 不要使用mongoDB。 Use custom structures and serialisation strategies tailored to your specific requirements. 使用根据您的特定要求定制的自定义结构和序列化策略。 After all, true many-to-many relationships are nearly always outliers and so justify specific treatment. 毕竟，真正的多对多关系几乎总是异常，因此可以证明具体的治疗方法是正确的。

With that said, modelling the more usual few-to-few relationship in mongoDB can be achieved without sacrificing the rich document structure, and how you achieve this depends on your access patterns. 话虽如此，可以在不牺牲丰富的文档结构的情况下实现对mongoDB中更常见的少数关系的建模，以及如何实现这一点取决于您的访问模式。

So, with the restaurant / diner network example, if you are typically going to query a restaurant on its diners then you would create an array of diner_ids held with each restaurant. 因此，通过餐厅/餐馆网络示例，如果您通常要在其食客上查询餐馆，那么您将创建一个与每个餐厅一起保存的diner_ids数组。 The other way would mean an array of restaurant_ids held with each diner. 另一种方式意味着每个用餐者都会举办一系列restaurant_ids。 Both for for two-way query ability. 两者都用于双向查询能力。

Care has to be taken because there is no foreign_key constraint in mongoDB and therefore maintaining your data's referential integrity is your responsibility. 必须小心，因为mongoDB中没有foreign_key约束，因此维护数据的引用完整性是您的责任。

If performance is most important to you then you may wish to embed the data in each document rather than reference it with an id. 如果性能对您来说最重要，那么您可能希望将数据嵌入到每个文档中，而不是使用id引用它。 This is the higher performance option for reading (not so much for writing) as all the data can be pulled off the disk in one hit. 这是用于读取的更高性能选项（不是用于写入），因为所有数据都可以在一次命中中从磁盘中拉出。 It means that you will need to do more work when you update data values to ensure the integrity of your data, but often this is not as scary as it first seems. 这意味着在更新数据值时需要做更多工作以确保数据的完整性，但通常这并不像它最初看起来那么可怕。 How often do the diners really change their names? 用餐者经常改变名字的频率是多少？ And depending on the document sizes, you may not necessarily want to embed the full document, a subset of the data plus an id to point to the full record will often do the trick. 根据文档大小的不同，您可能不一定要嵌入完整的文档，数据的子集加上指向完整记录的id通常会起作用。

In short, mongoDB schema design should be driven by the application requirements. 简而言之，mongoDB架构设计应该由应用程序要求驱动。 Different schemas for different applications as opposed to one monolithic relational DB to rule them all. 不同应用程序的不同模式，而不是一个单片关系数据库来统治它们。 What is the reality of the data? 数据的实际情况是什么？ How does the application actually use this data? 应用程序如何实际使用此数据？ How big are the document objects being stored? 文件对象的存储量有多大？ Answer these questions and your schema will practically design itself. 回答这些问题，您的架构将实际设计自己。

I would create a checkins or visits collection. 我会创建一个checkins或visits集合。 When a user visits that restaurant, a new document is created which references both the user and the restaurant. 当用户访问该餐馆时，创建一个引用用户和餐馆的新文档。 This is fairly clean and straightforward 这是相当干净和直接的