
Ensuring index for nested repeating entities

I need to enforce unique constraint on a nested document, for example:

urlEntities: [
    { "url" : "http://t.co/ujBNNRWb0y" , "display_url" : "bit.ly/11JyiVp" , "expanded_url" : "http://bit.ly/11JyiVp" } ,
    { "url" : "http://t.co/DeL6RiP8KR" , "display_url" : "ow.ly/i/2HC9x" , "expanded_url" : "http://ow.ly/i/2HC9x" }
]

url, display_url, and expanded_url need to be unique. How do I issue the ensureIndex command for this condition in MongoDB?

Also, is it good design to have nested documents like this, or should I move them to a separate collection and reference them from inside urlEntities? I'm new to MongoDB, so any best-practice suggestions would be very helpful.

Full Scenario:

Say I have a document like the one below in a database that holds millions of documents:

{ "_id" : { "$oid" : "51f72afa3893686e0c406e19"} , "user" : "test" , "urlEntities" : [ { "url" : "http://t.co/64HBcYmn9g" , "display_url" : "ow.ly/nqlkP" , "expanded_url" : "http://ow.ly/nqlkP"}] , "count" : 0}

When I get another document with a similar urlEntities object, I need to update the user and count fields only. My first idea was to enforce a unique constraint on the urlEntities fields, handle the resulting exception, and then issue an update. The alternative, checking whether each entry exists before inserting, would have a significant impact on performance. So, how can I enforce uniqueness in urlEntities? I tried

{"urlEntities.display_url":1,"urlEntities.expanded_url":1},{unique:true}

But still I'm able to insert the same document twice without exceptions.

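A unique index enforces uniqueness across documents, not among the array elements inside a single document. Even with a unique index, you cannot prevent the following (simplified from your example):

db.collection.ensureIndex( { 'urlEntities.url' : 1 }, { unique: true } );
db.collection.insert( {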
    _id: 42,
    urlEntities: [
        { 
            "url" : "http://t.co/ujBNNRWb0y"
        },
        { 
            "url" : "http://t.co/ujBNNRWb0y"
        } 
    ]
});

Similarly, you will have the same problem with a compound unique key for nested documents.
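
To make the behavior described above concrete, here is a minimal plain-JavaScript model of a unique index on an array field. It is not MongoDB code: `UniqueIndexModel` and its methods are hypothetical names, and the model ignores details such as how documents without the field are indexed. It only shows that repeated values inside one document collapse into a single key, while a second document that reuses the value is rejected.

```javascript
// Plain-JavaScript model of a unique index on an array field; not MongoDB code.
// UniqueIndexModel and its methods are hypothetical names for illustration.
class UniqueIndexModel {
    constructor(arrayField, subField) {
        this.arrayField = arrayField;
        this.subField = subField;
        this.owners = new Map(); // indexed value -> _id of the owning document
    }

    insert(doc) {
        // One index key per distinct value in the document, so repeats
        // inside the same array collapse and never conflict with each other.
        const values = (doc[this.arrayField] || []).map(e => e[this.subField]);
        const keys = new Set(values);
        for (const key of keys) {
            if (this.owners.has(key)) {
                throw new Error("duplicate key: " + key);
            }
        }
        for (const key of keys) {
            this.owners.set(key, doc._id);
        }
    }
}

const index = new UniqueIndexModel("urlEntities", "url");

// Accepted: the same url twice within a single document.
index.insert({ _id: 42, urlEntities: [
    { url: "http://t.co/ujBNNRWb0y" },
    { url: "http://t.co/ujBNNRWb0y" }
] });

// Rejected: a second document that reuses the url.
let rejected = false;
try {
    index.insert({ _id: 43, urlEntities: [ { url: "http://t.co/ujBNNRWb0y" } ] });
} catch (e) {
    rejected = true;
}
```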

What you can do is the following:

db.collection.insert( {
    _id: 43,
    title: "This is an example",
} );
db.collection.update( 
    { _id: 43 },
    {
        '$addToSet': { 
            urlEntities: { 
                "url" : "http://t.co/ujBNNRWb0y" , 
                "display_url" : "bit.ly/11JyiVp" ,  
                "expanded_url" : "http://bit.ly/11JyiVp"
            }
        }
    }
);

Now the document with _id 43 has one urlEntities element. If you run the same update again, it will not add a new array element, because the full combination of url, display_url and expanded_url already exists.
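
The "full combination" rule can be sketched in plain JavaScript. This is only a model of the comparison, not MongoDB code: `addToSetModel` is a hypothetical helper, and `example.invalid/x` is a made-up value. Note the last call: an element that differs in a single field counts as new, so $addToSet does not make url unique on its own.

```javascript
// Plain-JavaScript model of $addToSet on an array of embedded documents.
// addToSetModel is a hypothetical helper, not a MongoDB API.
function addToSetModel(array, candidate) {
    // JSON.stringify is sensitive to field order, which stands in for
    // MongoDB's exact-match comparison of embedded documents.
    const wanted = JSON.stringify(candidate);
    const exists = array.some(element => JSON.stringify(element) === wanted);
    if (!exists) {
        array.push(candidate);
    }
    return !exists; // true when an element was added
}

const urlEntities = [];
const entity = {
    url: "http://t.co/ujBNNRWb0y",
    display_url: "bit.ly/11JyiVp",
    expanded_url: "http://bit.ly/11JyiVp"
};

// First call adds the element.
const first = addToSetModel(urlEntities, entity);
// Same fields and values again: nothing is added.
const second = addToSetModel(urlEntities, { ...entity });
// Same url but a different display_url (made-up value): treated as a new element.
const third = addToSetModel(urlEntities, { ...entity, display_url: "example.invalid/x" });
```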

Also, have a look at the $addToSet update operator's examples: http://docs.mongodb.org/manual/reference/operator/addToSet/

For indexes on nested documents, see the indexing section of the MongoDB manual.

Regarding the second part (best practices for nested documents): it really depends on your business logic and queries. If those nested documents don't make sense as first-class entities, meaning you won't search for them directly but only in the context of their parent document, then nesting them makes sense. Otherwise, you should consider extracting them.
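
As a rough illustration of the two options, the sketch below writes both shapes as plain JavaScript objects. The names `urlDocs`, `urlEntityIds`, `resolveUrlEntities` and the id `"u1"` are assumptions made up for this example. They do not come from the question.

```javascript
// Two hypothetical document shapes; names not present in the question
// (urlDocs, urlEntityIds, resolveUrlEntities, "u1") are assumptions.

// Embedded: URL entities live inside the parent and are read together with it.
const embedded = {
    _id: 1,
    user: "test",
    urlEntities: [
        { url: "http://t.co/64HBcYmn9g", display_url: "ow.ly/nqlkP", expanded_url: "http://ow.ly/nqlkP" }
    ]
};

// Referenced: each URL entity is its own document in a separate collection
// (modeled here as an array), and the parent stores only the ids.
const urlDocs = [
    { _id: "u1", url: "http://t.co/64HBcYmn9g", display_url: "ow.ly/nqlkP", expanded_url: "http://ow.ly/nqlkP" }
];
const referencing = { _id: 1, user: "test", urlEntityIds: ["u1"] };

// Resolving the references takes a second lookup per parent document.
function resolveUrlEntities(parent, docs) {
    return parent.urlEntityIds.map(id => docs.find(d => d._id === id));
}

const resolved = resolveUrlEntities(referencing, urlDocs);
```

With the referenced shape, every URL is a document of its own, so a unique index on url in that collection would apply to each URL individually. The cost is the extra lookup whenever the parent and its URLs are needed together.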

I don't think there is an absolute answer to your question. Read the chapter about indexing; it helped me a lot.
