Indexing not utilized during the MongoDB aggregation query

Question

I have stuck somewhere in MongoDB aggregate query. I tried to generate a summary report from the database which contains 110M records. during the report generation, I faced the following issues 1).Even though the collection is indexed they are not utilized for the search. 2).Once query execution finished memory of DB server not decreased. 3)query take considerable time to return the result.

im useing mongodb Atlas v4.2.8 sample document

{
    "_id": {
        "$oid": "5eb122f714d0510011e3a184"
    },
    "from": "Star_friends",
    "to": "94713414047",
    "accountName": "ZM",
    "accountId": "ZM",
    "campaignName": "test 1",
    "campaignId": "5eb122f1e921c3001922f73c",
    "campaignType": "BULK",
    "status": {
        "$numberInt": "3"
    },
    "reason": "No Routing",
    "channel": "sms",
    "messageType": {
        "$numberInt": "1"
    },
    "event": "MT",
    "content": "test 132",
    "credit": {
        "$numberInt": "1"
    },
    "msgId": "",
    "createdDateTime": "2020-05-05T13:55:27.743Z",
    "updatedTime": "2020-05-05T13:55:27.745Z",
    "uDate": "2020-05-05",
    "operator": "mobitel"
}

my query as follows

db.getCollection('report').aggregate([{
    "$match": {
        "createdDateTime": {
            "$gt": "2020-09-14T00:00:01.000Z",
            "$lt": "2020-09-15T23:59:99.999Z"
        },
        "messageType": {
            "$in": [1, 2]
        },
        "channel": {
            "$in": ["sms", "viber", "whatsapp"]
        },
        "accountId": {
            "$in": ["ZM", "KEELLS"]
        }
    }
}, {
    "$project": {
        "_id": 0,
        "channel": 1,
        "messageType": 1,
        "accountName": 1,
        "accountId": 1,
        "createdDateTime": 1,
        "uDate": 1,
        "credit": 1,
        "status": 1
    }
}, {
    "$group": {
        "_id": {
            "channel": "$channel",
            "messageType": "$messageType",
            "accountName": "$accountName",
            "accountId": "$accountId",
            "filteredDate": {
                "$substr": ["$createdDateTime", 0, 7]
            },
            "sortDate": "$uDate"
        },
        "total": {
            "$sum": "$credit"
        },
        "send": {
            "$sum": {
                "$cond": [{
                    "$in": ["$status", [2, 15, 1, 14, 6, 17, 4, 5]]
                }, "$credit", 0]
            }
        },
        "delivered": {
            "$sum": {
                "$cond": [{
                        "$in": ["$status", [6, 17, 4]]
                    },
                    "$credit",
                    0
                ]
            }
        },
        "deliveryFailed": {
            "$sum": {
                "$cond": [{
                    "$in": ["$status", [12, 5]]
                }, "$credit", 0]
            }
        },
        "failed": {
            "$sum": {
                "$cond": [{
                    "$in": ["$status", [3]]
                }, "$credit", 0]
            }
        },
        "datass": {
            "$addToSet": {
                "channel": "$channel",
                "messageType": "$messageType",
                "accountName": "$accountName",
                "accountId": "$accountId",
                "filteredDate": {
                    "$substr": ["$createdDateTime", 0, 7]
                },
                "sortDate": "$uDate"
            }
        }
    }
}, {
    "$unwind": "$datass"
}, {
    "$project": {
        "_id": 0
    }
}, {
    "$sort": {
        "datass.sortDate": -1
    }
}])

indexes as follows

accountId_1 / accountId_1_createdDateTime_-1 / campaignId_-1 / channel_1 / createdDateTime_-1 / messageType_1 / msgId_-1 / msgId_-1_status_1

I would be appreciated if someone can help me with this

Thanks

Answer 1

You gave us little information. How many documents should average query like such return? How long does it take to execute the said query?

What I can see here is that your match pipeline is good, because you are trying to filter out documents by fields that are indexed. But what is a "performance smell" here is your $sort function which does sorting on non-indexed field. Try to do sorting immediately after $match.

Play with it a little more and try to figure out which stage of the pipeline is a performance bottle-neck.

Answer 2

I have resolved my issue by changing my indexes

accountId_1_createdDateTime_-1 / msgId_-1_status_1 / accountId_1_messageType_1_channel_1_createdDateTime_1_accountName_1_uDate_1_credit_1_status_1

Indexing not utilized during the MongoDB aggregation query

Question

2 answers

solution1
0 2020-11-17 09:41:37

solution2
0 ACCPTED 2020-12-08 13:03:27

Indexing not utilized during the MongoDB aggregation query

Question

2 answers

solution1 0 2020-11-17 09:41:37

solution2 0 ACCPTED 2020-12-08 13:03:27

solution1
0 2020-11-17 09:41:37

solution2
0 ACCPTED 2020-12-08 13:03:27