I have two collections of ids, each with roughly 15 million documents, and I would like to compare the two so that I can return the documents of cola that are not in colb.
cola has 14.5 million documents and colb has 15.5 million documents:
Example of cola
{
"_id" : "123"
},
{
"_id" : "45"
}
Example of colb
{
"_id" : "123"
},
{
"_id" : "456"
},
{
"_id" : "4"
}
I would like the results to be
{
"_id" : "456"
},
.
.
.
{
"_id" : "4"
}
Using $lookup hangs, and using distinct fails because the result exceeds the 16 MB size limit. I have also tried aggregate with $nin, but aggregate returns a cursor object rather than an array, so $nin errors out because it expects an array.
This $lookup hangs and never finishes:
db.cola.aggregate([
    {
        $lookup: {
            from: "colb",
            localField: "ID",
            foreignField: "ID",
            as: "ID_match"
        }
    },
    {
        $match: {
            $expr: {
                $eq: [ { "$size": "$ID_match" }, 0 ]
            }
        }
    }
])
This fails with "cyclic dependency detected":
var a = db.cola.aggregate({$group: {_id: "$ClaimID"}});
db.cola.find({ID: {$nin: a}})
I also wrote a JS loop but looping through 15 million rows is not efficient.
What else are my options?
I think you have the same problem as the one in this link. You can use the $lookup aggregation stage to join the two collections and keep only the values you need.
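A minimal sketch of that $lookup approach, under a few assumptions: the comparison key is `_id` (the sample documents show only `_id`, while the attempt in the question joins on `ID`), and the goal is the ids shown in the expected output, which are the colb documents missing from cola. The helper `buildAntiJoinPipeline` and the output collection name `colb_not_in_cola` are hypothetical names chosen for illustration. Joining on `_id` lets the lookup use the index that every collection has on `_id`; if the join field is not indexed in the other collection, each lookup may scan that whole collection, which could explain the hang.

```javascript
// Hypothetical helper (not from the original post): builds an anti-join
// pipeline that keeps documents whose _id has no match in `otherName`.
function buildAntiJoinPipeline(otherName, outName) {
  return [
    // Join on _id; assumes _id is the comparison key (it is always indexed).
    {
      $lookup: {
        from: otherName,
        localField: "_id",
        foreignField: "_id",
        as: "matches"
      }
    },
    // Keep only documents with no partner in the other collection.
    { $match: { matches: { $size: 0 } } },
    // Drop the empty join array so only _id is kept.
    { $project: { _id: 1 } },
    // Write the result to a collection instead of returning it to the shell.
    { $out: outName }
  ];
}

// Documents of colb that are not in cola (swap the names for the reverse).
const pipeline = buildAntiJoinPipeline("cola", "colb_not_in_cola");

// Only runs inside the MongoDB shell, where the global `db` exists.
if (typeof db !== "undefined") {
  db.colb.aggregate(pipeline, { allowDiskUse: true });
}
```

Writing the output with `$out` is a design choice for a result set this large: the ids can then be paged through with an ordinary `find` on the new collection rather than held in a shell variable.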