Mongo Aggregate Query Optimization

Question

I have a collection with 2.7 million documents. I need to fetch some data based on certain conditions. The problem is that my query scans almost 1 million documents to return only 5.

Please help me optimize this query. What index should I create to minimize the number of documents scanned?

Here is my query

{
    "aggregate": "posts",
    "pipeline": [
      {
        "$match": {
          "status": "A",
          "hashtagIds": {
            "$oid": "5d9c866d9f733d2359a3e0e0"
          },
          "mediaLocation.mediaType": 2,
          "mediaLocation.thumbNailPath": {
            "$exists": true,
            "$ne": null
          }
        }
      },
      {
        "$lookup": {
          "from": "users",
          "localField": "userId",
          "foreignField": "_id",
          "as": "ownerData"
        }
      },
      {
        "$unwind": {
          "path": "$ownerData",
          "preserveNullAndEmptyArrays": true
        }
      },
      {
        "$sort": {
          "viewsCount": -1
        }
      },
      {
        "$limit": 5
      }
    ]
}

Answer 1

A better index and a reordering of the stages should help a great deal.

Index

The current pipeline uses the index on

{
  "mediaLocation.mediaType": 1,
  status: 1,
  genter: 1
}

While this index does support 2 of the 4 queried fields, it does not support the sort operation, so the query executor must load all of the matching documents into memory and sort them to determine which 5 documents come first.

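This query would be served much better by an index that includes all of the queried fields and the sort field. Note that the equality-matched fields come before the sort field in the index spec, while mediaLocation.thumbNailPath, which is matched with $exists and $ne rather than by equality, comes after it: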

{
  "mediaLocation.mediaType": 1,
  status: 1,
  hashtagIds: 1,
  viewsCount: -1,
  "mediaLocation.thumbNailPath": 1
}
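
In the mongo shell, an index with this key order could be created roughly as follows. This is only a sketch: the collection name posts comes from the question, while the index name and the guard around db are assumptions added for illustration, so that the snippet also loads outside the shell.

```javascript
// Key order: equality-matched fields first, then the sort field,
// then the field matched with $exists / $ne.
const indexSpec = {
  "mediaLocation.mediaType": 1,
  status: 1,
  hashtagIds: 1,
  viewsCount: -1,
  "mediaLocation.thumbNailPath": 1
};

// Hypothetical index name, chosen only for readability.
const indexOptions = { name: "posts_hashtag_views" };

// `db` only exists inside the mongo shell; skip the call elsewhere.
if (typeof db !== "undefined") {
  db.posts.createIndex(indexSpec, indexOptions);
}
```

Building an index on a collection of this size can take a while, so it is worth trying on a copy of the data first.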

Stage order

In the existing pipeline:

$match: all 856k matching documents are retrieved
$lookup: 856k queries are executed against the users collection
$unwind: 856k array fields converted to object
$sort: in-memory sort of 856k documents
$limit: return the first 5 documents

A simple reordering of the stages, along with the above index, would significantly improve performance:

$match, $sort, $limit: if the above index exists, placing these three stages first allows the query planner to combine them into one, identifying documents in pre-sorted order using the index and stopping as soon as 5 matches are found. The combined stage will read 5 documents, plus the index keys
$lookup: executes 5 queries against the users collection
$unwind: converts 5 arrays to objects
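
The reordered pipeline described above might look like the following sketch. The stage contents are copied from the question; the helper name buildPipeline and the guard around the shell globals are assumptions made so the snippet does not depend on a live connection.

```javascript
// Build the reordered pipeline. In the mongo shell, `hashtagId` should be
// an ObjectId; it is a parameter here so the function does not rely on
// shell-only globals.
function buildPipeline(hashtagId) {
  return [
    {
      $match: {
        status: "A",
        hashtagIds: hashtagId,
        "mediaLocation.mediaType": 2,
        "mediaLocation.thumbNailPath": { $exists: true, $ne: null }
      }
    },
    // Sort and limit directly after the match, before any join work.
    { $sort: { viewsCount: -1 } },
    { $limit: 5 },
    // The remaining stages now only see the 5 surviving documents.
    {
      $lookup: {
        from: "users",
        localField: "userId",
        foreignField: "_id",
        as: "ownerData"
      }
    },
    { $unwind: { path: "$ownerData", preserveNullAndEmptyArrays: true } }
  ];
}

// Only run the query when the mongo shell globals are available.
if (typeof db !== "undefined" && typeof ObjectId !== "undefined") {
  const pipeline = buildPipeline(ObjectId("5d9c866d9f733d2359a3e0e0"));
  printjson(db.posts.aggregate(pipeline).toArray());
}
```

To check whether the change helps, running db.posts.explain("executionStats").aggregate(pipeline) before and after and comparing the number of documents examined should show the difference.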

1 answer

solution1
0 ACCEPTED 2020-06-14 07:33:39
