
MongoDB find() slow when querying a 64-bit integer field

I have a Mongo collection called Elements containing ~9 million documents. Each document has the following structure:

{
  _id : "1",
  Timestamp : NumberLong(12345),
  Nationality : "ITA",
  Value: 5
}

If I run the following query:

db.Elements.find({ Nationality: 'ITA' })

the query is fast (a few milliseconds).

If, instead, I run the following query:

db.Elements.find({ Timestamp: 12345 })

the query is slow, on the order of tens of seconds. Obviously, if I add an index on Timestamp, the query runs much faster. Running the same query on the field Value, which is of type Int32, is as fast as the first query.

What I am trying to understand is: why would the second query (without an index) perform significantly worse than the first? Does Mongo treat Int64 values differently from other values?

It turns out I was making a mistake.

I was using Robomongo to execute the queries; by default, Robomongo pages the results (the default page size is 50 items).

Because the Timestamp field contains values that are almost always distinct, the query had to scan almost the entire collection before it could fill up and return one page. On the other hand, because the other fields contain values with a limited range (the Value field, although it is an Int32, has a limited domain in my application), I was getting results quickly because I was only looking at the first page.
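The effect described above can be reproduced without a database. The following is only a sketch in plain JavaScript: the function docsExaminedForFirstPage, the 100,000-document array and the four nationality codes are all made up for illustration, and the loop merely imitates an unindexed scan that stops once one page of 50 matches has been collected. It is not how MongoDB or Robomongo is implemented.

```javascript
// Imitates an unindexed scan that stops as soon as one page is full.
// Returns how many documents had to be examined.
function docsExaminedForFirstPage(docs, predicate, pageSize) {
  let examined = 0;
  let matched = 0;
  for (const doc of docs) {
    examined++;
    if (predicate(doc)) {
      matched++;
      if (matched === pageSize) break; // page is full, stop scanning
    }
  }
  return examined;
}

// Hypothetical data: unique Timestamp per document, Nationality cycling
// over only four values (assumption, not the real Elements collection).
const nationalities = ["ITA", "FRA", "DEU", "ESP"];
const docs = Array.from({ length: 100000 }, (_, i) => ({
  Timestamp: i,
  Nationality: nationalities[i % nationalities.length],
}));

// Common value: the page fills after a short prefix of the collection.
const commonValueCost = docsExaminedForFirstPage(
  docs, (d) => d.Nationality === "ITA", 50);

// Rare value: only one document matches, so the page never fills
// and every document is examined.
const rareValueCost = docsExaminedForFirstPage(
  docs, (d) => d.Timestamp === 12345, 50);

console.log(commonValueCost); // 197
console.log(rareValueCost);   // 100000
```

With a low-cardinality field the scan stops after a few hundred documents, while a filter that matches a single document forces the scan to run to the end, regardless of the field's type.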

When I run the same queries without paging (e.g. by appending a count or obtaining an execution plan), all of them perform poorly without indexes.
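One way to make that comparison in the shell is to read the execution statistics instead of looking at the first page. The helper below is a sketch, not something from my original session: scanStats is a hypothetical name, and it assumes it is given the shell's db object for the database that holds Elements. It relies on explain("executionStats"), which runs the query to completion and reports how many documents were examined.

```javascript
// Hypothetical helper: summarises the full (unpaged) cost of a filter
// on the Elements collection. `db` is assumed to be the shell's db object.
function scanStats(db, filter) {
  const stats = db.Elements.find(filter)
    .explain("executionStats")
    .executionStats;
  return {
    nReturned: stats.nReturned,                 // documents matching the filter
    totalDocsExamined: stats.totalDocsExamined, // documents read to find them
    executionTimeMillis: stats.executionTimeMillis,
  };
}

// Usage inside the mongo shell (kept as comments so the block has no
// side effects when loaded outside a shell session):
//   scanStats(db, { Nationality: "ITA" })
//   scanStats(db, { Timestamp: 12345 })
//   db.Elements.createIndex({ Timestamp: 1 })  // then compare again
```

Without an index, totalDocsExamined should be close to the collection size for every one of these filters, which is what shows that the fast queries were only fast because of paging.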

Therefore, there doesn't seem to be any special treatment of Int64 values as opposed to other primitive types.

