MongoDB - slow query on old documents (aggregation and sorting)

I have two DBs for testing, each containing thousands to hundreds of thousands of documents. Both have the same schemas and CRUD operations.

Let's call them DB1 and DB2.

I am using Mongoose. Suddenly DB1 became really slow during:

const eventQueryPipeline = [
  {
    $match: {
      $and: [{ userId: req.body.userId }, { serverId: req.body.serverId }],
    },
  },
  {
    $sort: {
      sort: -1,
    },
  },
];

const aggregation = db.collection
  .aggregate(eventQueryPipeline)
  .allowDiskUse(true);
aggregation.exect((err, result) => {
  res.json(result);
});

In DB2 the exact same query runs in anywhere from milliseconds up to a maximum of 10 seconds.
In DB1 the query never takes less than 40 seconds.

I do not understand why. What could I be missing? I compared the documents and the indexes and they're the same. Deleting the collection and saving the documents again brings the speed back to normal and acceptable, but why is it happening? Has anyone had the same experience?

Short answer:

You should create the following index:

{ "userId": 1, "serverId": 1, "sort": 1 }

Longer answer

Based on your code (I see that you have .allowDiskUse(true)), it looks like Mongo is trying to do an in-memory sort on a lot of data. By default, Mongo has a 100 MB memory limit for sort operations, and you can allow it to use temporary files on disk to store data if it hits that limit. You can read more about it here: https://www.mongodb.com/docs/manual/reference/method/cursor.allowDiskUse/
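One way to check whether the sort is really happening in memory is to look at the query plan (for example from explain()) and see whether the winning plan contains a SORT stage. The sketch below is only illustrative: hasBlockingSort is a hypothetical helper, the two sample plans are hand-written and heavily simplified, and where the winning plan sits inside the explain output depends on your MongoDB version.

```javascript
// Hypothetical helper: walks a winningPlan subtree and reports whether
// it contains a SORT stage, i.e. a sort that is not served by an index.
function hasBlockingSort(node) {
  if (Array.isArray(node)) return node.some(hasBlockingSort);
  if (node === null || typeof node !== "object") return false;
  if (node.stage === "SORT") return true;
  // Recurse into nested stages (inputStage, inputStages, queryPlan, ...).
  return Object.values(node).some(hasBlockingSort);
}

// Hand-written, simplified plan shapes for illustration only (assumptions).
const withoutIndex = {
  stage: "SORT",
  inputStage: { stage: "COLLSCAN" },
};
const withIndex = {
  stage: "FETCH",
  inputStage: { stage: "IXSCAN", indexName: "userId_1_serverId_1_sort_1" },
};

console.log(hasBlockingSort(withoutIndex), hasBlockingSort(withIndex));
```

Pass only the winning plan to the helper, since rejected plans in the same explain output may still contain a SORT stage.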

To optimise the performance of your queries, you can use indexes. A common rule to follow when planning indexes is ESR (Equality, Sort, Range). You can read more about it here: https://www.mongodb.com/docs/v4.2/tutorial/equality-sort-range-rule/

If we follow that rule while creating our compound index, we add the equality matches first, in your case "userId" and "serverId". After that comes the sort field, in your case "sort".

If you needed to additionally filter results by some range (e.g. some value greater than X, or a timestamp later than yesterday), you would add that field after "sort".

That means your index should look like this:

schema.index({ userId: 1, serverId: 1, sort: 1 });
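The ESR ordering described above can be sketched as a small helper that assembles the index keys in the right order. buildEsrIndex is a hypothetical function (not part of Mongoose or MongoDB), and "createdAt" is an assumed range field that does not appear in the question; it is there only to show where a range field would go.

```javascript
// Hypothetical helper: builds a compound index spec in
// Equality, Sort, Range order. Not a Mongoose or MongoDB API.
function buildEsrIndex({ equality = [], sort = {}, range = [] }) {
  const spec = {};
  // 1. Equality fields first.
  for (const field of equality) spec[field] = 1;
  // 2. Sort fields next, keeping the requested direction.
  for (const [field, direction] of Object.entries(sort)) spec[field] = direction;
  // 3. Range fields last.
  for (const field of range) spec[field] = 1;
  return spec;
}

const indexSpec = buildEsrIndex({
  equality: ["userId", "serverId"],
  sort: { sort: 1 },
  range: ["createdAt"], // assumed field, for illustration only
});

console.log(JSON.stringify(indexSpec));
```

Without the range entry this produces the same keys as the index above. Because the query filters on userId and serverId by equality, MongoDB can walk that index backwards to serve the descending sort, so the index direction of 1 still works for { sort: -1 }.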

Additionally, you can probably remove allowDiskUse, and handle err inside the aggregation.exec callback (I'm assuming that aggregation.exect is a typo).
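As a rough sketch of that last point, the query could be written without allowDiskUse and with explicit error handling. It uses promises rather than a callback, which is an assumption about your Mongoose version (recent versions return a promise from exec()). runEventQuery and handleEvents are hypothetical names, and `model` is assumed to be your Mongoose model.

```javascript
// Hypothetical helper: `model` is assumed to be a Mongoose model, i.e.
// something whose aggregate(pipeline).exec() returns a promise.
async function runEventQuery(model, userId, serverId) {
  const pipeline = [
    // Two equality conditions; equivalent to the original $and.
    { $match: { userId, serverId } },
    { $sort: { sort: -1 } },
  ];
  return model.aggregate(pipeline).exec();
}

// Express-style handler sketch: responds with the result,
// or with a 500 and the error message if the query fails.
async function handleEvents(model, req, res) {
  try {
    const result = await runEventQuery(model, req.body.userId, req.body.serverId);
    res.json(result);
  } catch (err) {
    res.status(500).json({ error: err.message });
  }
}
```

With the index in place the sort should no longer need to spill to disk, so if the query still fails or is slow, the error surfaced here is the first thing to look at.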
