The current MongoDB query takes up to 5 minutes to search through 2 documents when each document has 10,000 contacts. Please suggest ways to improve this significantly.
I am trying to search for a phone number in hundreds of documents. Each document belongs to a user, and each user has a contacts array (see the document structure below) with 10,000 objects; each object can have 2 to 3 phone numbers. If a phone number is found in multiple documents, I need the MongoDB query to return an array with the userNumber values found in those documents.
Below is the structure of a document in my MongoDB collection. For simplicity, I show only one object in the contacts array; in fact there are thousands of objects.
{
  "_id": { "$oid": "61d1f04266289f003452d705" },
  "userID": { "$oid": "61d1efea2c0fab00340f47c8" },
  "contacts": [
    {
      "emailAddresses": [
        { "id": "6884", "label": "email1", "email": "addedemail@gmail.com" }
      ],
      "phoneNumbers": [
        {
          "label": "other",
          "id": "4594",
          "number": "+918984292930"
        },
        {
          "label": "other",
          "id": "4595",
          "number": "+911234567890"
        }
      ],
      "_id": { "$oid": "61d1f04266289f003452d744" },
      "ContactName": "Sample User 1 Name Changed",
      "ContactNumber": "+918984292930",
      "recordID": "833"
    }
  ],
  "userNumber": "+911234567890",
  "__v": 7
}
Current MongoDB Query:
await ContactModel.aggregate([
  {
    $match: {
      userNumber: userNumber,
    },
  },
  {
    $unwind: "$contacts",
  },
  {
    $lookup: {
      from: "phonenumbers",
      let: {
        contactNumberVar: "$contacts.ContactNumber",
      },
      pipeline: [
        { $unwind: "$contacts" },
        {
          $project: {
            userNumber: 1,
            "contacts.ContactNumber": 1,
          },
        },
        {
          $match: {
            $and: [
              { $expr: { $eq: ["$$contactNumberVar", "$userNumber"] } },
              {
                $expr: {
                  $eq: [contactNumber, "$contacts.ContactNumber"],
                },
              },
            ],
          },
        },
      ],
      as: "mutualContacts",
    },
  },
  {
    $project: {
      userID: 1,
      "mutualContacts.userNumber": 1,
    },
  },
  {
    $group: {
      _id: "$userID",
      mutualContacts: {
        $push: {
          $cond: [
            { $gt: [{ $size: "$mutualContacts" }, 0] },
            { $arrayElemAt: ["$mutualContacts.userNumber", 0] },
            "$$REMOVE",
          ],
        },
      },
    },
  },
]).exec()
First of all, ensure you have indexes that support the query on both collections. An index on { userNumber: 1 } should be a good candidate, but please test other options.
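As a rough sketch of that first step, the snippet below lists candidate indexes and adds a small check for whether an explain() result mentions an index scan. The helper names (candidateIndexes, createCandidateIndexes, mentionsIndexScan) are hypothetical, and the second index is only an assumption to test, not something the question or this answer confirms will help.

```javascript
// Candidate index specifications to try.
// { userNumber: 1 } is the one suggested above; the second is an assumption
// worth testing against real data, not a guaranteed improvement.
const candidateIndexes = [
  { userNumber: 1 },
  { "contacts.ContactNumber": 1 }, // multikey index over the embedded array
];

// Create every candidate index on a collection-like object.
// `collection` must expose createIndex(spec), as the MongoDB Node.js driver
// collection does (with Mongoose it is reachable as Model.collection).
async function createCandidateIndexes(collection) {
  const created = [];
  for (const spec of candidateIndexes) {
    created.push(await collection.createIndex(spec));
  }
  return created;
}

// Crude check: does an explain() result mention an index scan anywhere?
// It only searches the serialized plan for the "IXSCAN" stage name.
function mentionsIndexScan(explainOutput) {
  return JSON.stringify(explainOutput).includes("IXSCAN");
}

// Usage (inside an async function, with a live connection):
//   await createCandidateIndexes(ContactModel.collection);
//   const plan = await ContactModel.aggregate(pipeline).explain();
//   console.log(mentionsIndexScan(plan));
```

Comparing the explain output and the timings before and after each index is the only reliable way to decide which ones to keep.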
Next - query optimisation. In the lookup pipeline:
pipeline: [
  { $unwind: "$contacts" },
  {
    $project: {
      userNumber: 1,
      "contacts.ContactNumber": 1,
    },
  },
  {
    $match: {
      $and: [
        { $expr: { $eq: ["$$contactNumberVar", "$userNumber"] } },
        {
          $expr: {
            $eq: [contactNumber, "$contacts.ContactNumber"],
          },
        },
      ],
    },
  },
],
This unwinds the whole phonenumbers collection before any filtering happens. Match first, then unwind and project only the matching documents instead:
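pipeline: [
  {
    // Filter whole documents first so that only candidates are unwound.
    $match: {
      $and: [
        { $expr: { $eq: ["$$contactNumberVar", "$userNumber"] } },
        {
          // Before $unwind, "$contacts.ContactNumber" resolves to an array,
          // so test membership with $in; $eq against a string never matches.
          $expr: {
            $in: [contactNumber, { $ifNull: ["$contacts.ContactNumber", []] }],
          },
        },
      ],
    },
  },
  { $unwind: "$contacts" },
  {
    $project: {
      userNumber: 1,
      "contacts.ContactNumber": 1,
    },
  },
  {
    // After $unwind, keep only the contact entries that carry the number.
    $match: {
      $and: [
        { $expr: { $eq: ["$$contactNumberVar", "$userNumber"] } },
        {
          $expr: {
            $eq: [contactNumber, "$contacts.ContactNumber"],
          },
        },
      ],
    },
  },
],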
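The reordered pipeline above could also be wrapped in a small helper, sketched here under a few assumptions. buildLookupPipeline is a hypothetical name, and the sketch swaps in plain query syntax for the constant phone number: on an array field it matches any element and, unlike the $expr comparison, it may be able to use a multikey index. $expr is kept only for the variable defined in the $lookup let. Whether this is actually faster on your data needs to be measured.

```javascript
// Hypothetical helper: builds the $lookup sub-pipeline for one phone number.
function buildLookupPipeline(contactNumber) {
  return [
    {
      // Filter whole documents first, so only candidates get unwound.
      $match: {
        // Plain query syntax on an array field matches any element.
        "contacts.ContactNumber": contactNumber,
        // $expr is needed only to reach the variable from the $lookup `let`.
        $expr: { $eq: ["$userNumber", "$$contactNumberVar"] },
      },
    },
    { $unwind: "$contacts" },
    // Keep only the unwound contact entries that carry the number.
    { $match: { "contacts.ContactNumber": contactNumber } },
    { $project: { userNumber: 1, "contacts.ContactNumber": 1 } },
  ];
}

// Usage: drop the result into the $lookup stage of the original aggregate.
const lookupStage = {
  $lookup: {
    from: "phonenumbers",
    let: { contactNumberVar: "$contacts.ContactNumber" },
    pipeline: buildLookupPipeline("+918984292930"),
    as: "mutualContacts",
  },
};
```

The rest of the outer aggregate (the initial $match, $unwind, $project and $group) stays as in the question.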