MongoDB aggregation query ($group) takes too much time

Question

The $group stage takes 5 minutes to execute, and I have 100,000,000 records in my collection. I am using MongoDB 4.2 on a machine with 8 CPUs and 32 GB of RAM. Is there a better way to optimize the query or the indexes?

db.getCollection("text").explain("executionStats").aggregate(
[
    { 
        "$match" : { 
            "CreatedDate" : { 
                "$gte" : ISODate("2021-01-01T15:43:50.325+0000"), 
                "$lte" : ISODate("2021-03-29T15:43:50.325+0000")
            }
        }
    }, 
    { 
        "$project" : { 
            "TX_DATE" : { 
                "$dateToString" : { 
                    "format" : "%Y-%m", 
                    "date" : "$CreatedDate"
                }
            }, 
            "Exp_Count" : 1.0
        }
    }, 
    { 
        "$group" : { 
            "_id" : { 
                "TX_DATE_Month" : "$TX_DATE"
            }, 
            "Exp_Count" : { 
                "$sum" : "$Exp_Count"
            }
        }
    }, 
    { 
        "$project" : { 
            "_id" : 0.0, 
            "TX_DATE" : "$_id.TX_DATE_Month", 
            "Exp_Count" : 1.0
        }
    }, 

    { 
        "$sort" : { 
            "TX_DATE" : 1.0
        }
    }
], 
{ 
    "allowDiskUse" : false
}

);

Answer 1

The $match stage can help you with this.

The $match stage selects only the documents you need, so the rest of the aggregation runs on that subset instead of the whole collection. It is similar to the WHERE clause in a MySQL query. A $match placed at the start of the pipeline can also use the indexes created on the collection, and when it filters on indexed keys, finding and grouping the required documents becomes much cheaper. For example, to group students aged 13 by gender in a school's data, with the age field indexed, the aggregation command is:

db.SchoolData.aggregate([{'$match':{'age':13}},{'$group':{'_id':'$gender'}}])

This narrows the aggregation to documents with age 13, and with an index on that same key the filter becomes much more efficient.
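As a minimal sketch of the indexing point, assuming the `SchoolData` collection and `age` field from the example above and that no index on `age` exists yet, the index could be created and the plan inspected roughly as follows. The names `studentsPipeline` and `ageIndex` are illustrative only. The `db` global exists only inside the mongo shell, so the database calls are guarded.

```javascript
// Pipeline from the example: filter on the indexed key first, then group.
const studentsPipeline = [
  { $match: { age: 13 } },
  { $group: { _id: "$gender" } },
];

// Assumed index: ascending on the field used by the leading $match.
const ageIndex = { age: 1 };

// `db` is only defined in the mongo shell; skip the calls elsewhere.
if (typeof db !== "undefined") {
  db.SchoolData.createIndex(ageIndex);
  // With the index in place, the plan should report an index scan (IXSCAN)
  // for the $match rather than a full collection scan (COLLSCAN).
  printjson(db.SchoolData.explain("executionStats").aggregate(studentsPipeline));
}
```

Comparing `totalDocsExamined` in the explain output before and after creating the index is one way to check whether the index is actually being used.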

Note that

db.SchoolData.aggregate([{'$match':{'age':13}},{'$group':{'_id':'$gender'}}])

and

db.SchoolData.aggregate([{'$group':{'_id':'$gender'}},{'$match':{'age':13}}])

behave very differently. The first command aggregates only the documents with age 13. The second groups every document in the collection before filtering, so it cannot use the index on age to reduce the work. In addition, the output of that $group contains only _id, so there is no age field left for the later $match to test and the second pipeline returns no documents. Always place $match as early in the pipeline as possible.
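Applied to the pipeline in the question, the same idea might look like the sketch below. It rests on assumptions the question does not confirm: that an index starting with `CreatedDate` can be created on the `text` collection, and that only `CreatedDate` and `Exp_Count` are needed. It also folds the first $project into $group by computing the month key directly in `_id`. `buildMonthlyPipeline` and `monthlyPipeline` are hypothetical names, and this is not a guaranteed fix, so compare the explain output before and after.

```javascript
// Hypothetical helper: builds the monthly-sum pipeline for a date range.
function buildMonthlyPipeline(start, end) {
  return [
    // Range filter first, so an index on CreatedDate can be used.
    { $match: { CreatedDate: { $gte: start, $lte: end } } },
    // Compute the "%Y-%m" key inside $group instead of a separate $project.
    {
      $group: {
        _id: { $dateToString: { format: "%Y-%m", date: "$CreatedDate" } },
        Exp_Count: { $sum: "$Exp_Count" },
      },
    },
    // Reshape the handful of grouped rows to the output shape in the question.
    { $project: { _id: 0, TX_DATE: "$_id", Exp_Count: 1 } },
    { $sort: { TX_DATE: 1 } },
  ];
}

const monthlyPipeline = buildMonthlyPipeline(
  new Date("2021-01-01T15:43:50.325Z"),
  new Date("2021-03-29T15:43:50.325Z")
);

// `db` is only defined in the mongo shell; skip the calls elsewhere.
if (typeof db !== "undefined") {
  // Assumed compound index: both fields the pipeline reads are in the index,
  // which may let the server answer from the index alone.
  db.getCollection("text").createIndex({ CreatedDate: 1, Exp_Count: 1 });
  printjson(
    db.getCollection("text").explain("executionStats").aggregate(monthlyPipeline)
  );
}
```

Even with the index, three months of a 100,000,000-record collection can still mean a large scan, so the explain statistics are the place to confirm whether the time is spent reading documents or grouping them.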
