
$sort makes my query too slow in MongoDB

I have a query like the one below and want to sort the result by date. I have a descending index on DateTime and an ascending index on UserId, but when I sort the result by DateTime the query becomes very slow.

db.Users.aggregate([
  { "$match" : { "UserId" : { "$in" : [NUUID("1b029f8b-a17e-3172-9247-9cddfaf9702b")] } } },
  { "$match" : { "DateTime" : { "$gte" : ISODate("2018-08-15T12:54:38Z"),
    "$lte" : ISODate("2018-08-25T12:54:38Z") } } },
  { "$sort" : { "DateTime" : -1 } }, { "$skip" : 0 }, { "$limit" : 20 }])

When I remove the sort stage, the query becomes very fast. I also tried the variant below, which sorts by UserId instead, and it was very fast as well.

db.Users.aggregate([
  { "$match" : { "DateTime" : { "$gte" : ISODate("2018-08-15T12:54:38Z"),
    "$lte" : ISODate("2018-08-25T12:54:38Z") } } },
  { "$match" : { "UserId" : { "$in" : [NUUID("1b029f8b-a17e-3172-9247-9cddfaf9702b")] } } },
  { "$sort" : { "UserId" : 1 } }, { "$skip" : 0 }, { "$limit" : 20 }])

Why is it slow only when I want to sort it by DateTime? This is the structure of my document:

{
    "_id" : NUUID("11111111-1111-1111-1111-629f7992f895"),
    "DateTime" : ISODate("2018-08-23T15:49:51.153Z"),
    "UserId" : NUUID("aaaaaaaa-aaaa-aaaa-9247-9cddfaf9702b"),
    "PostId" : NUUID("bbbbbbbb-bbbb-bbbb-9529-d49ae48b2604"),
    "Type" : 3
}

Because by default MongoDB creates a unique index on the _id field, which is what you are using when your sort is fast => { "UserId" : 1 }.

Adding an index on the DateTime should help with the speed there.

Here are some considerations when it comes to sorting fields.

The performance issue with your first query is that you have created separate indexes on DateTime (descending) and UserId (ascending). MongoDB (as at 4.0) cannot use index intersection to sort query results when a sort operation is completely separate from the predicate, so if these are the only candidate indexes available only one can be chosen.

Note: although you have two $match stages in your source pipeline, the MongoDB server will coalesce these into a single $match stage, which is the equivalent query using $and.
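As a rough illustration of that coalescing, the two stages from the question are equivalent to a single $match along these lines. This is only a sketch: `userId` is a placeholder standing in for the UUID value in the question, plain `Date` objects stand in for the `ISODate(...)` shell helper, and the guarded call only runs inside the mongo shell.

```javascript
// Placeholder for the UUID from the question; substitute the real UUID value
// in the form your shell or driver expects (a plain string will not match).
const userId = "<user-uuid>";

// One $match stage equivalent to the two separate $match stages.
const coalescedMatch = {
  $match: {
    $and: [
      { UserId: { $in: [userId] } },
      { DateTime: { $gte: new Date("2018-08-15T12:54:38Z"),
                    $lte: new Date("2018-08-25T12:54:38Z") } }
    ]
  }
};

const pipeline = [
  coalescedMatch,
  { $sort: { DateTime: -1 } },
  { $skip: 0 },
  { $limit: 20 }
];

// `db` only exists inside the mongo shell; elsewhere this block is skipped.
if (typeof db !== "undefined") {
  printjson(db.Users.aggregate(pipeline).toArray());
}
```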

Why is it slow only when I want to sort it by DateTime?

Sorting results in-memory is considered an expensive operation and there is an aggregation stage memory limit (100MB) which cannot be exceeded unless you also add the allowDiskUse option to your aggregation. As at MongoDB 4.0, the query planner doesn't have stats about index cardinality so aggregation will favour the index plan supporting efficient sort (which is DateTime in your case). The outcome of your first query will be an index scan to find all matching DateTime values (in sorted order) as well as a comparison against each matching document with the UserId criteria.
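For reference, the allowDiskUse option mentioned above is passed as the second argument to aggregate(). A minimal sketch, assuming the same collection and fields as in the question; `userId` is a placeholder, and the guarded call only runs inside the mongo shell.

```javascript
// Placeholder for the UUID from the question; substitute the real UUID value.
const userId = "<user-uuid>";

const pipeline = [
  { $match: { UserId: { $in: [userId] },
              DateTime: { $gte: new Date("2018-08-15T12:54:38Z"),
                          $lte: new Date("2018-08-25T12:54:38Z") } } },
  { $sort: { DateTime: -1 } },
  { $skip: 0 },
  { $limit: 20 }
];

// allowDiskUse lets stages such as $sort write temporary files to disk
// instead of failing when they hit the per-stage memory limit.
const options = { allowDiskUse: true };

// `db` only exists inside the mongo shell; elsewhere this block is skipped.
if (typeof db !== "undefined") {
  printjson(db.Users.aggregate(pipeline, options).toArray());
}
```

This only avoids the memory-limit error; it does not make the sort itself any cheaper.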

In the second query, sorted by UserId, the UserId index can be used for both matching and sorting results. Results still need to be filtered for DateTime, but the UserId criterion is likely much more selective, so there are fewer documents to scan.

The ideal index to support both queries would be a compound index including both DateTime and UserId supporting the desired sort order. For example: db.Users.createIndex({ UserId: 1, DateTime: -1 }). If you add this compound index you can also drop the original { UserId: 1 } index, since a prefix of the compound index can efficiently answer the same queries.
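A short sketch of that change follows. The createIndex(), dropIndex() and getIndexes() calls are standard shell methods, while `isIndexPrefix` is a hypothetical helper written only to illustrate the prefix rule; it is not part of MongoDB.

```javascript
// Compound index suggested above: equality field first, then the sort field.
const compoundIndex = { UserId: 1, DateTime: -1 };

// Hypothetical helper (not a MongoDB API): true when `candidate` has the same
// leading keys, in the same order and direction, as `index`.
function isIndexPrefix(candidate, index) {
  const c = Object.entries(candidate);
  const i = Object.entries(index);
  return c.length <= i.length &&
    c.every(([key, dir], n) => i[n][0] === key && i[n][1] === dir);
}

// { UserId: 1 } is a prefix of the compound index, so it becomes redundant;
// { DateTime: -1 } is not a prefix and may still be needed by other queries.
const userIdRedundant = isIndexPrefix({ UserId: 1 }, compoundIndex);
const dateTimeRedundant = isIndexPrefix({ DateTime: -1 }, compoundIndex);

// `db` only exists inside the mongo shell; elsewhere this block is skipped.
if (typeof db !== "undefined") {
  db.Users.createIndex(compoundIndex);
  if (userIdRedundant) {
    db.Users.dropIndex({ UserId: 1 });
  }
  printjson(db.Users.getIndexes());
}
```

Putting UserId first means the matching index entries for one user are already stored in DateTime order, so the $sort can be satisfied from the index.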

The most straightforward way to understand query performance would be to explain the aggregation query with executionStats. For aggregation pipelines this level of explain detail requires MongoDB 3.6+; for older server versions you can explain the equivalent find() query. Your aggregation query currently doesn't include any processing stages that can't be expressed in a standard find() query.
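Both forms of explain might look roughly like this. It is a sketch under the same assumptions as the question's query: `userId` is a placeholder for the real UUID value, and the guarded calls only run inside the mongo shell.

```javascript
// Placeholder for the UUID from the question; substitute the real UUID value.
const userId = "<user-uuid>";

// Shared predicate, so the aggregation and the find() query stay equivalent.
const filter = {
  UserId: { $in: [userId] },
  DateTime: { $gte: new Date("2018-08-15T12:54:38Z"),
              $lte: new Date("2018-08-25T12:54:38Z") }
};
const sortSpec = { DateTime: -1 };

const pipeline = [
  { $match: filter },
  { $sort: sortSpec },
  { $skip: 0 },
  { $limit: 20 }
];

// `db` only exists inside the mongo shell; elsewhere this block is skipped.
if (typeof db !== "undefined") {
  // MongoDB 3.6+: explain the aggregation directly.
  printjson(db.Users.explain("executionStats").aggregate(pipeline));

  // Older servers: explain the equivalent find() query instead.
  printjson(
    db.Users.find(filter).sort(sortSpec).skip(0).limit(20)
      .explain("executionStats")
  );
}
```

In the output, comparing totalKeysExamined and totalDocsExamined against nReturned, and checking whether a SORT stage appears in the winning plan, usually shows whether the sort is served by an index or done in memory.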

For more information see Use Indexes to Sort Query Results in the MongoDB documentation. The blog post Optimizing MongoDB Compound Indexes also has some helpful background (despite using explain output from an older version of MongoDB).

Add an index for the properties you are using in your query.

Mongo needs an index to efficiently sort or match data by a given property. Without it, Mongo must visit every single document in the collection to check the value of said property.

In your case, you want to make sure you have an index on UserId and DateTime for this aggregation.

Since you also have a PostId field, which I imagine you use in other queries, you should add an index for it as well.

You may also want to look at compound indexes: https://docs.mongodb.com/manual/core/index-compound


 