
One-to-Many Mongoose relationship - Where to store the reference

I'm designing the MongoDB collection architecture for a new project, and being new to MongoDB, I had a question about a one-to-many relationship.

For the sake of this example, let's say the relationship is Datacenter-to-Servers, meaning one Datacenter can have multiple Servers (thousands, not limited in the app), and each Server can belong to only one Datacenter.

Would it be best to have Servers._datacenter referencing the Datacenter._id? Or a Datacenter.servers array to store the server IDs?

If you suggest having an array in the Datacenter documents that references the associated server IDs, is there a way to find out which Datacenter a server belongs to when you only have the server ID (kind of like a quick "where serverId in Datacenter.servers" query), without having to query every Datacenter and then check for the ID in every Datacenter.servers array?

If you suggest having a field in the Server documents that references the Datacenter it belongs to, is there a way to query for the Datacenter and return all of the associated Server documents inside a virtual Datacenter.servers array or something similar?

I'm not quite sure what the best route is. Since there can be a very large number of servers for each datacenter, I think it may be better not to have such a large array inside each Datacenter document. But if I set it up so that each Server document references its parent Datacenter, that seems to make queries rather difficult. (Or not? Maybe there's a very easy way I just haven't discovered. I did say I'm new to Mongo.)

I was reading through this document, which shows how to set the reference direction up either way, and it states:

To avoid mutable, growing arrays, store the publisher reference inside the book document

So that makes me think it would be best to reference the Datacenter ID in the Server documents. If that's the case, is there a way to return all the server documents as an array inside the Datacenter documents? Or would I have to query for the Datacenter, then query for all the Servers with that Datacenter._id, and then return a merged object?

It depends on the access pattern, that is, on how you are planning to code this, as null1941 said.

If the number of servers is in the tens or hundreds, that is really a one-to-few relationship rather than one-to-many, so you could embed the datacenter inside each server document. You then get all the information you need in a single query. The cost is duplication: many servers sit in one datacenter, so the same datacenter data is copied into many server documents, and you have to keep those copies consistent yourself. This can work if you can guarantee that consistency and the datacenter carries little information. The only advantage is that you run a single query. Generally this approach is not recommended; also avoid it if you want to treat the datacenter as a separate document and run operations on it.

If you do go for this approach and embed the datacenter as an array, you can use $all or $in to search inside the array.

example:

{
    "_id" : ObjectId("63546464sad65s4ad3654"),
    "name" : "Server1",
    "datacenter" : ["gamma", "500"]
}

query:

db.servers.find({ "datacenter": { $in: [ "gamma", "delta" ] } })
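To make the difference between $in and $all on an array field concrete, here is a rough sketch in plain JavaScript. It only emulates the matching rule for non-empty lists of simple values; the helper names matchesIn and matchesAll are made up for illustration and are not part of any MongoDB API.

```javascript
// Plain-JavaScript emulation of how $in and $all treat an array field.
// Illustrative only: valid for non-empty candidate lists of scalar values.

// $in: true when the array field holds at least one of the candidates.
function matchesIn(fieldValues, candidates) {
  return candidates.some((candidate) => fieldValues.includes(candidate));
}

// $all: true when the array field holds every one of the candidates.
function matchesAll(fieldValues, candidates) {
  return candidates.every((candidate) => fieldValues.includes(candidate));
}

const server = { name: "Server1", datacenter: ["gamma", "500"] };

console.log(matchesIn(server.datacenter, ["gamma", "delta"]));  // true
console.log(matchesAll(server.datacenter, ["gamma", "delta"])); // false
console.log(matchesAll(server.datacenter, ["gamma", "500"]));   // true
```

So the $in query above would return Server1, while the same query with $all would not, because "delta" is missing from its array.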

Alternatively, you can embed the servers as documents inside the datacenter document (embedding the datacenter document inside each server works as well). You can then search inside the embedded documents using dot notation. Example (servers is an array of embedded documents, title is a field of each one):

{
"_id" : ObjectId("63546464sad65s4ad3654"),
"name" : "gamma",
"servers" : [
            {
              "title" : "server1",
              "speed" : "3.2GHZ",
              "ram"   : "200GB"
            },
            {
              "title" : "server2",
              "speed" : "3.2GHZ",
              "ram"   : "64GB"
            }
         ]
}

query:

db.datacenters.find({ "servers.title": "server1" })

Again, you be the judge. Whichever way you decide to do it, MongoDB has a way to retrieve the information you need.

Keep in mind that if you embed servers inside the datacenter document, a single MongoDB document cannot exceed 16 MB. If embedding could exceed that size, you should split the data into separate collections (the approach below).
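To get a feel for that limit, here is a small back-of-the-envelope sketch. It computes the BSON size of the lightest possible case, an array that holds only ObjectId references, using the BSON layout (a 4-byte length, one element per entry made of a type byte, the array index as a null-terminated string key, and the 12-byte ObjectId, then a closing zero byte). The function name is made up for illustration, and the rest of the document is not counted.

```javascript
// Exact BSON size, in bytes, of an array holding `count` ObjectId values.
// Hypothetical helper for estimation; it does not talk to MongoDB.
function objectIdArrayBsonSize(count) {
  let size = 4 + 1; // int32 length prefix + trailing 0x00 terminator
  for (let index = 0; index < count; index++) {
    const keyBytes = String(index).length + 1; // array index as a C string
    size += 1 + keyBytes + 12; // type byte + key + 12-byte ObjectId
  }
  return size;
}

const MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024; // 16 MB document limit

for (const count of [1000, 100000, 1000000]) {
  const size = objectIdArrayBsonSize(count);
  console.log(count, size, size < MAX_BSON_DOCUMENT_BYTES);
}
```

Under these assumptions, a thousand references take about 17 KB, while a million no longer fit in one document. Embedded server documents are much larger than a bare ObjectId, so they reach the limit far sooner.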

The better approach for your case is not to embed, basically as gnerkus said. Keep in mind, however, that MongoDB has no foreign key constraints, so you have to ensure consistency in the application: every server_id stored in the datacenter collection must exist in the server collection (and vice versa). You could instead put datacenter_id in the server collection. I decide between the two based on my use case: if most of my operations are on datacenters, I add the server IDs to the datacenter; if most of my operations are on the server collection, I add datacenter_id to it. In both cases you will be doing two or more queries. Here is an example:

Datacenter document example

 {
     _id : ObjectId("10001000010000"),
     name : 'Gamma',
     location: 'pluto',
     servers: [
         ObjectId('1212'),
         ObjectId('1213')
     ]
 }

Servers document example:

{
    _id : ObjectId("1212"),
    name : 'Server1',
    ram: '250GB',
    type: 'processing',
    status: 'running' 
}

In this case you could query as follows. First, get the datacenter you need (assuming name is unique):

datacenter = db.datacenters.findOne({name: "Gamma"})

Then query for the server details you need. For example, to get all servers in the datacenter above:

servers = db.servers.find({_id: { $in : datacenter.servers } } )

Once you have the servers, you can loop through them and check the status or anything else. The server documents end up in the servers variable.
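The two steps can be wrapped into a helper, and the reverse lookup from the question (which datacenter owns a given server ID) is a single query with this layout. This is only a sketch: it assumes db behaves like the shell's db handle, that the collections are named datacenters and servers, and the function names are made up.

```javascript
// Shell-style helpers. `db` is assumed to behave like the mongosh `db` handle,
// with `datacenters` and `servers` collections shaped like the examples above.

// All server documents referenced by the named datacenter (two queries).
function serversInDatacenter(db, datacenterName) {
  const datacenter = db.datacenters.findOne({ name: datacenterName });
  if (datacenter === null) {
    return []; // no datacenter with that name
  }
  return db.servers.find({ _id: { $in: datacenter.servers } }).toArray();
}

// The datacenter whose `servers` array contains the given server ID.
// An equality filter on an array field matches when any element equals the value,
// so there is no need to load every datacenter and scan its array by hand.
function datacenterOfServer(db, serverId) {
  return db.datacenters.findOne({ servers: serverId });
}
```

If the reverse lookup is frequent, an index on the array field, for example db.datacenters.createIndex({ servers: 1 }), should keep it from scanning the whole collection.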

I hope that helps

It is best to reference the Datacenter IDs in the Server documents. To retrieve the servers with the specified datacenter's ID, you'll simply query the server collection. The query isn't difficult and looks like this:

var dataID = datacenter._id

// Returns every server document whose datacenter field holds this datacenter's _id
db.servercollection.find({ datacenter: dataID }).toArray()
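If you also want the servers returned as an array on the datacenter document, as the question asks, one option is an aggregation with a $lookup stage instead of merging two results by hand. The sketch below only builds the pipeline. It assumes each server stores its parent's _id in a datacenter field, as in the query above; the function name and the default collection name are assumptions.

```javascript
// Builds an aggregation pipeline that returns one datacenter with its servers
// attached as a `servers` array. Assumes each server document stores the
// parent's _id in a `datacenter` field.
function datacenterWithServersPipeline(datacenterId, serverCollection = "servercollection") {
  return [
    { $match: { _id: datacenterId } },
    {
      $lookup: {
        from: serverCollection,     // collection holding the server documents
        localField: "_id",          // the datacenter's own _id
        foreignField: "datacenter", // the reference stored on each server
        as: "servers",              // name of the output array field
      },
    },
  ];
}

// "example-id" is a placeholder; pass the real datacenter._id in practice.
const pipeline = datacenterWithServersPipeline("example-id");
console.log(JSON.stringify(pipeline, null, 2));
```

You would run it against the datacenter collection, for example db.datacenters.aggregate(datacenterWithServersPipeline(datacenter._id)). In Mongoose, populate virtuals are another way to get a similar result.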

