
One-to-Many Mongoose relationship - Where to store the reference

I'm designing the MongoDB collection architecture for a new project, and being new to MongoDB, I have a question about a one-to-many relationship.

For the sake of this example, let's say the relationship is Datacenter-to-Servers: one Datacenter can have many servers (thousands, with no limit imposed by the app), and each server can belong to only one Datacenter.

Would it be best to have Servers._datacenter referencing Datacenter._id? Or a Datacenter.servers array to store the server IDs?

If you suggest having an array in the Datacenter documents to reference the server IDs associated with it, is there a way to find out which Datacenter a server belongs to when you only have the server ID (kind of like a quick "where serverId in Datacenter.servers" query), without having to query every Datacenter and then check for the ID in every Datacenter.servers array?

If you suggest having a field in the Servers documents to reference the Datacenter each one belongs to, is there a way to query for the Datacenter and return all of the associated Server documents inside a virtual Datacenter.servers array or something similar?

I'm not quite sure what the best route is. Since there can be a very large number of servers for each datacenter, I think it may be a better idea not to have such a large array inside each Datacenter document. But if I set it up so that each Server document references its parent Datacenter, that makes queries rather difficult. (Or not? Maybe there's a very easy way I just haven't discovered; I did say I'm new to Mongo.)

I was reading through this document, and it shows how to set the reference direction up either way, and it states:

To avoid mutable, growing arrays, store the publisher reference inside the book document.

So that makes me think it would be best to reference the Datacenter IDs in the Server documents. If that's the case, is there a way to return all the server documents as an array inside the Datacenter documents? Or would I have to query for the Datacenter, then query for all the Servers with that Datacenter._id, and then return a merged object?

It would depend on the access pattern and on how you are planning to code this, as null1941 said.

If the number of servers is in the tens or hundreds, I guess that would be a one-to-few relationship rather than one-to-many, so you could go ahead and embed the datacenter inside each server. That way you get all the information you need in a single query. The catch is duplication: because many servers exist in one datacenter, the same datacenter document is copied into many server documents, so this only works if you can guarantee consistency and the datacenters carry little information. The only advantage of this approach is that you run just one query. Generally this approach is not recommended; also, if you want to treat the datacenter as a separate document so that you can run operations on it, then avoid this approach.

If you decide to go for this approach and embed the datacenter as an array, you can use $all or $in to search inside the array.

Example:

{
    "_id" : ObjectId("63546464sad65s4ad3654"),
    "name" : "Server1",
    "datacenter" : ["gamma", "500"]
}

Query:

db.servers.find({ "datacenter": { $in: ["gamma", "delta"] } })
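To make the matching rule concrete, here is a minimal plain-JavaScript sketch of what the $in filter above selects. It runs against an in-memory array rather than a real database, and the sample documents and the helper name matchesIn are made up for illustration.

```javascript
// Hypothetical in-memory stand-in for the collection of server documents.
const servers = [
  { name: "Server1", datacenter: ["gamma", "500"] },
  { name: "Server2", datacenter: ["delta", "120"] },
  { name: "Server3", datacenter: ["omega", "75"] },
];

// Mimics { field: { $in: candidates } } for an array field:
// a document matches when at least one element of the array
// is among the candidates.
function matchesIn(doc, field, candidates) {
  const value = doc[field];
  const values = Array.isArray(value) ? value : [value];
  return values.some((v) => candidates.includes(v));
}

const hits = servers.filter((s) => matchesIn(s, "datacenter", ["gamma", "delta"]));
console.log(hits.map((s) => s.name)); // [ 'Server1', 'Server2' ]
```

By contrast, $all would only match documents whose array contains every listed value, so the same candidates would match nothing in this sample.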

If you decide to embed servers as documents (you could embed the datacenter document inside servers as well; both can work), you can search inside the embedded documents using dot notation. Example (servers is an array of embedded documents, and title is an attribute of each one):

{
"_id" : ObjectId("63546464sad65s4ad3654"),
"name" : "gamma",
"servers" : [
            {
              "title" : "server1",
              "speed" : "3.2GHZ",
              "ram"   : "200GB"
            },
            {
              "title" : "server2",
              "speed" : "3.2GHZ",
              "ram"   : "64GB"
            }
         ]
}

Query:

db.datacenters.find({ "servers.title": "server1" })
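As a rough illustration of what the dot-notation filter does, the sketch below applies the same rule to plain JavaScript objects. The second datacenter and the helper name hasServerTitled are invented for the example; nothing here talks to MongoDB.

```javascript
// Hypothetical in-memory stand-in for the datacenters collection.
const datacenters = [
  {
    name: "gamma",
    servers: [
      { title: "server1", ram: "200GB" },
      { title: "server2", ram: "64GB" },
    ],
  },
  { name: "delta", servers: [{ title: "server9", ram: "32GB" }] },
];

// Mimics { "servers.title": wanted }: a datacenter matches when any
// of its embedded server documents has that title.
function hasServerTitled(datacenter, wanted) {
  return datacenter.servers.some((server) => server.title === wanted);
}

const owner = datacenters.find((dc) => hasServerTitled(dc, "server1"));
console.log(owner.name); // gamma
```

Note that the match returns the whole datacenter document, including every embedded server, not just the one server that matched.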

Again, you be the judge. Whichever way you decide to do it, there is a way in MongoDB to retrieve the information you need.

Now keep in mind that if you decide to embed servers inside the datacenter document, a single document in MongoDB cannot exceed 16MB. If embedding could push a document past this size, you should go for the splitting approach (below).
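To get a feel for that limit, here is a back-of-the-envelope sketch for Node.js. It uses the JSON text length as a stand-in for the real BSON size, which is only an approximation, and the sample server is an assumption shaped like the embedded documents above.

```javascript
const MAX_DOCUMENT_BYTES = 16 * 1024 * 1024; // MongoDB's 16 MB document limit

// Rough size of a value: the byte length of its JSON text. BSON encodes
// values differently, so treat this as an approximation only.
function approxBytes(value) {
  return Buffer.byteLength(JSON.stringify(value), "utf8");
}

// Hypothetical embedded server document.
const sampleServer = { title: "server1", speed: "3.2GHZ", ram: "200GB" };

const perServer = approxBytes(sampleServer);
const roughCapacity = Math.floor(MAX_DOCUMENT_BYTES / perServer);
console.log(`~${perServer} bytes per server, roughly ${roughCapacity} servers per document`);
```

Tiny subdocuments like this one leave a lot of headroom, but richer server records shrink the capacity quickly, and a large, constantly growing array is costly to update well before the hard limit is reached.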

Now, the better approach for your case is the one that does not embed, basically as gnerkus said. However, keep in mind that there are no foreign key constraints in MongoDB; you have to ensure consistency in the application, so that every server_id in the datacenter collection can be found in the server collection (and vice versa). You could also put datacenter_id inside the server collection; my way of deciding which one to go for is my use case. For example, if most of my operations are on datacenters, I will add server_id to them; if most of my operations are on the server collection, I will add datacenter_id to it. In both cases you will be doing two or more queries. Here is an example:

Datacenter document example:

{
    _id : ObjectId("10001000010000"),
    name : 'Gamma',
    location: 'pluto',
    servers: [
        ObjectId('1212'),
        ObjectId('1213')
    ]
}

Servers document example:

{
    _id : ObjectId("1212"),
    name : 'Server1',
    ram: '250GB',
    type: 'processing',
    status: 'running' 
}

In this case you could query as follows. First, get the datacenter you need (assuming name is unique):

datacenter = db.datacenter.findOne({name: "Gamma"})

Then query for the server details you need; for example, to get all servers in the datacenter above:

servers = db.servers.find({_id: { $in : datacenter.servers } } )

After you have all the servers, you can loop through each one and check its status or whatever else you need. You will end up with the server documents in the servers variable.
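The two-step flow can be sketched in plain JavaScript as follows. The arrays stand in for the two collections, the string ids replace real ObjectIds, and every name and sample value here is an assumption made for the illustration.

```javascript
// Hypothetical in-memory stand-ins for the two collections.
const datacenterDocs = [{ _id: "dc1", name: "Gamma", servers: ["1212", "1213"] }];
const serverDocs = [
  { _id: "1212", name: "Server1", status: "running" },
  { _id: "1213", name: "Server2", status: "stopped" },
  { _id: "1300", name: "Server3", status: "running" },
];

// Step 1: the equivalent of findOne({ name: "Gamma" }).
const gamma = datacenterDocs.find((dc) => dc.name === "Gamma");

// Step 2: the equivalent of find({ _id: { $in: gamma.servers } }).
const gammaServers = serverDocs.filter((s) => gamma.servers.includes(s._id));

// Loop over the results, for example to pick out the running servers.
const running = gammaServers.filter((s) => s.status === "running").map((s) => s.name);
console.log(running); // [ 'Server1' ]

// Reverse lookup: which datacenter lists a given server id?
const ownerOf1213 = datacenterDocs.find((dc) => dc.servers.includes("1213"));
console.log(ownerOf1213.name); // Gamma
```

The reverse lookup at the end corresponds to filtering on the array field directly, such as { servers: serverId }, since an equality match on an array field matches when any element equals the value. That covers the "which datacenter owns this server" case without scanning every array by hand.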

I hope that helps.

It is best to reference the Datacenter IDs in the Server documents. To retrieve the servers with the specified datacenter's ID, you simply query the server collection. The query isn't difficult and looks like this:

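var dataID = datacenter._id;

// servers: the Server documents whose datacenter field equals dataID
db.servercollection.find({ datacenter: dataID }, function(err, servers) {
    // handle err, then use the servers array here
});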
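With the reference stored on the server, finding a server's datacenter is just a read of its datacenter field. Getting the servers back as an array on the datacenter can be done in one round trip with a $lookup aggregation stage. The sketch below shows such a pipeline next to a plain-JavaScript emulation of what it returns; the collection name "servers", the field names, and the sample data are assumptions, and only the emulation actually runs here.

```javascript
// Pipeline one could pass to an aggregate() call on the datacenter collection.
// Collection and field names are assumptions carried over from the answer.
const pipeline = [
  { $match: { name: "Gamma" } },
  { $lookup: { from: "servers", localField: "_id", foreignField: "datacenter", as: "servers" } },
];

// In-memory emulation of that pipeline, for illustration only.
const dcCollection = [
  { _id: "dc1", name: "Gamma" },
  { _id: "dc2", name: "Delta" },
];
const serverCollection = [
  { _id: "s1", name: "Server1", datacenter: "dc1" },
  { _id: "s2", name: "Server2", datacenter: "dc1" },
  { _id: "s3", name: "Server3", datacenter: "dc2" },
];

function withServers(datacenterName) {
  return dcCollection
    .filter((dc) => dc.name === datacenterName) // $match
    .map((dc) => ({
      ...dc,
      // $lookup: attach every server whose datacenter field equals dc._id
      servers: serverCollection.filter((s) => s.datacenter === dc._id),
    }));
}

const [merged] = withServers("Gamma");
console.log(merged.servers.map((s) => s.name)); // [ 'Server1', 'Server2' ]
```

Since the question mentions Mongoose, a populated virtual on the Datacenter schema is another way to get the same merged shape; which of the two fits better depends on how the rest of the application reads the data.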
