简体   繁体   中英

MongoDB merge two collections with unmatched documents

I am trying to compare and find different documents from two collections below are the samples, Mongodb version:4.0 , ORM:mongoose

**col1: Has one new document**

    { "id" : 200001, "mobileNo" : #######001 }
    { "id" : 200002, "mobileNo" : #######002 } //mobileNo may not be unique.
    { "id" : 200003, "mobileNo" : #######002 }
    { "id" : 200004, "mobileNo" : #######004 }

**col2:**

    { "id" : 200001, "mobileNo" : #######001 }
    { "id" : 200002, "mobileNo" : #######002 }
    { "id" : 200003, "mobileNo" : #######003 }

Now I want to insert the document { "id": 200004, "mobileNo": #######004 } from col1 to col2 ie; the documents which doesn't match.

This is what I've tried so far:

const col1= await Col1.find({}, { mobileNo: 1,id: 1, _id: 0 })
 col1.forEach(async function (col1docs) {
    let col2doc = await Col2.find({ mobileNo: { $ne: col1docs.mobileNo}, 
    id:{$ne:col1docs.id} }, { mobileNo: 1, _id: 0, id: 1 })
     if (!(col2doc)) {
            Col2.insertMany(col1docs);

            }
        });

I have also tried with $eq instead of $ne but neither i am getting the unmatched documents nor they are getting inserted. Any suggestions??? Combination of id+phoneNo is unique

I would say instead of doing two .find() calls plus iteration & then third call to write data, try this query:

db.col1.aggregate([
    {
      $lookup: {
        from: "col2",
        let: { id: "$id", mobileNo: "$mobileNo" },
        pipeline: [
          {
            $match: { $expr: { $and: [ { $eq: [ "$id", "$$id" ] }, { $gte: [ "$mobileNo", "$$mobileNo" ] } ] } }
          },
          { $project: { _id: 1 } } // limiting to `_id` as we don't need entire doc of `col2` - just need to see whether a ref exists or not
        ],
        as: "data"
      }
    },
    { $match: { data: [] } // Filtering leaves docs in `col1` which has no match in `col2`
    },
    { $project: { data: 0, _id: 0 } }
 ])

Test: mongoplayground

Details: From the above query you're taking advantage of specifying conditions in $lookup to get docs from col1 which have reference in col2 . Let's say $lookup will run on each document of col1 - So with the unique combination of id & mobileNo from current document in col1 has a matching in col2 then col2 doc's _id will be pushed in data array, at the end what we get out of col1 is data: [] to say no matching docs were found for these col1 doc's. Now you can just write all the returned docs to col2 using .insertMany() . Actually you can do this entire thing using $merge on MongoDB version > 4.2 without any need of 2nd write call( .insertMany() ).

For your scenario on MongoDB version > 4.2 something like this will merge docs to second collection:

{ $merge: 'col2' } // Has to be final stage in aggregation

Note: If this has to be done periodically - no matter how you do this, try to minimize data that you're operating on, maybe maintain a time field & you can use that field to filter docs first & do this job, or you can also take advantage of _id to say we've done for all these docs in last run & we need to start from this docs - which helps you a lot to reduce data to be worked on. Additionally don't forget to maintain indexes.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM