MongoDB $group + $sum aggregation is very slow

Question

I have an aggregation query in MongoDB:

[{
    $group: {
        _id: '$status',
        status: {
            $sum: 1
        }
    }
}]

It runs on a collection that has ~80 million documents. The status field is indexed, yet the query is very slow and takes around 60 seconds or more.

I ran explain() on the query, but it got me almost nowhere:

{
        "explainVersion" : "1",
        "stages" : [
                {
                        "$cursor" : {
                                "queryPlanner" : {
                                        "namespace" : "loa.document",
                                        "indexFilterSet" : false,
                                        "parsedQuery" : {

                                        },
                                        "queryHash" : "B9878693",
                                        "planCacheKey" : "8EAA28C6",
                                        "maxIndexedOrSolutionsReached" : false,
                                        "maxIndexedAndSolutionsReached" : false,
                                        "maxScansToExplodeReached" : false,
                                        "winningPlan" : {
                                                "stage" : "PROJECTION_SIMPLE",
                                                "transformBy" : {
                                                        "status" : 1,
                                                        "_id" : 0
                                                },
                                                "inputStage" : {
                                                        "stage" : "COLLSCAN",
                                                        "direction" : "forward"
                                                }
                                        },
                                        "rejectedPlans" : [ ]
                                }
                        }
                },
                {
                        "$group" : {
                                "_id" : "$status",
                                "status" : {
                                        "$sum" : {
                                                "$const" : 1
                                        }
                                }
                        }
                }
        ],
        "serverInfo" : {
                "host" : "rack-compute-2",
                "port" : 27017,
                "version" : "5.0.6",
                "gitVersion" : "212a8dbb47f07427dae194a9c75baec1d81d9259"
        },
        "serverParameters" : {
                "internalQueryFacetBufferSizeBytes" : 104857600,
                "internalQueryFacetMaxOutputDocSizeBytes" : 104857600,
                "internalLookupStageIntermediateDocumentMaxSizeBytes" : 104857600,
                "internalDocumentSourceGroupMaxMemoryBytes" : 104857600,
                "internalQueryMaxBlockingSortMemoryUsageBytes" : 104857600,
                "internalQueryProhibitBlockingMergeOnMongoS" : 0,
                "internalQueryMaxAddToSetBytes" : 104857600,
                "internalDocumentSourceSetWindowFieldsMaxMemoryBytes" : 104857600
        },
        "command" : {
                "aggregate" : "document",
                "pipeline" : [
                        {
                                "$group" : {
                                        "_id" : "$status",
                                        "status" : {
                                                "$sum" : 1
                                        }
                                }
                        }
                ],
                "explain" : true,
                "cursor" : {

                },
                "lsid" : {
                        "id" : UUID("a07e17fe-65ff-4d38-966f-7517b7a5d3f2")
                },
                "$db" : "loa"
        },
        "ok" : 1
}

I see that it does a full COLLSCAN, but I can't understand why.
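For longer explain documents it can help to pull out just the stage names instead of reading the whole tree. The following is a minimal sketch in plain JavaScript: `collectStages` is a hypothetical helper (not a MongoDB API), and the sample object is a trimmed copy of the winningPlan from the explain output above.

```javascript
// Hypothetical helper (not part of MongoDB): recursively collect the
// "stage" names from one plan node of an explain() document.
function collectStages(plan, found = []) {
    if (plan === null || typeof plan !== "object") return found;
    if (typeof plan.stage === "string") found.push(plan.stage);
    // Plans nest through inputStage (one child) or inputStages (several).
    if (plan.inputStage) collectStages(plan.inputStage, found);
    if (Array.isArray(plan.inputStages)) {
        for (const child of plan.inputStages) collectStages(child, found);
    }
    return found;
}

// Trimmed copy of the winningPlan from the explain output above.
const winningPlan = {
    stage: "PROJECTION_SIMPLE",
    transformBy: { status: 1, _id: 0 },
    inputStage: { stage: "COLLSCAN", direction: "forward" }
};

const stages = collectStages(winningPlan);
console.log(stages.join(" <- "));
console.log("full collection scan:", stages.includes("COLLSCAN"));
```

An index-backed plan would be expected to list an IXSCAN stage where this one lists COLLSCAN.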

I plan on supporting a couple hundred million (or even a billion) documents in that collection, but this problem hijacks my plans for seemingly no reason.

Answer 1

You can advise the query planner to use the index by passing a hint, as follows:

// Same pipeline as in the question, with a hint naming the index on status.
db.test.explain("executionStats").aggregate(
    [
        { $group: { _id: "$status", status: { $sum: 1 } } }
    ],
    { hint: "status_1" }
)

Make sure the index name in the hint is the same as the name the index was created with (db.test.getIndexes() will show you the exact index name).
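To avoid guessing the name, it can be looked up from the getIndexes() result by key pattern. This is a rough sketch, not a definitive approach: `findIndexName` is a hypothetical helper, and `sampleIndexes` is an assumed example of what getIndexes() might return for a collection with a single-field index on status (the real output for your collection may differ).

```javascript
// Hypothetical helper: return the name of the index whose key pattern
// matches keyPattern exactly (same fields, same order), or null if none.
function findIndexName(indexes, keyPattern) {
    const wanted = JSON.stringify(keyPattern);
    const match = indexes.find((ix) => JSON.stringify(ix.key) === wanted);
    return match ? match.name : null;
}

// Assumed sample of a getIndexes() result; not taken from the question.
const sampleIndexes = [
    { v: 2, key: { _id: 1 }, name: "_id_" },
    { v: 2, key: { status: 1 }, name: "status_1" }
];

const hintName = findIndexName(sampleIndexes, { status: 1 });
console.log(hintName);
```

In the shell, the same helper could be called as findIndexName(db.test.getIndexes(), { status: 1 }) and the result passed as the hint.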


Answer 1
1 2022-06-07 20:03:13
