MongoDB find() slow when querying a 64-bit integer field
I have a Mongo collection called Elements containing ~9 million documents. Each document has the following structure:
{
_id : "1",
Timestamp : NumberLong(12345),
Nationality : "ITA",
Value: 5
}
If I run the following query:
db.Elements.find({ Nationality: 'ITA' })
the query is fast (a few milliseconds).
If, instead, I run the following query:
db.Elements.find({ Timestamp: 12345 })
the query is slow, on the order of tens of seconds. Obviously, if I add an index on Timestamp, the query runs much faster. Running the same query on the field Value, which is of type Int32, runs as fast as the first query.
What I am trying to understand is: why would the second query (without an index) perform significantly worse than the first? Does Mongo treat Int64 values differently from other values?
It turns out I was making a mistake.
I was using Robomongo to execute the queries; by default, Robomongo pages the results (the default page size is 50 items).
Because the Timestamp field contains values that are almost always different, the query had to perform an almost-full scan before it could fill up and return one page. On the other hand, because the other fields contain values that have a limited range (the Value field, although it is Int32, has a limited domain in my application), I was getting results quickly because I was only looking at the first page.
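The paging effect can be reproduced without a database. The following is a minimal sketch in plain JavaScript, assuming a toy in-memory collection and a scan that stops as soon as one page is full; buildCollection, firstPage and the sample data are hypothetical names and values for illustration, not Robomongo or MongoDB internals.

```javascript
// Simulate an unindexed scan that stops as soon as one page of results is full.
// PAGE_SIZE mirrors the 50-item default page size mentioned above.
const PAGE_SIZE = 50;

// Build a toy collection (hypothetical data): Timestamp is unique per document,
// Nationality cycles through a handful of values.
function buildCollection(size) {
  const nationalities = ["ITA", "FRA", "DEU", "ESP"];
  const docs = [];
  for (let i = 0; i < size; i++) {
    docs.push({
      _id: String(i),
      Timestamp: i,
      Nationality: nationalities[i % nationalities.length],
    });
  }
  return docs;
}

// Scan documents in order, collecting matches until the page is full.
// Returns the page and the number of documents that had to be examined.
function firstPage(docs, predicate, pageSize = PAGE_SIZE) {
  const page = [];
  let examined = 0;
  for (const doc of docs) {
    examined++;
    if (predicate(doc)) {
      page.push(doc);
      if (page.length === pageSize) break;
    }
  }
  return { page, examined };
}

const docs = buildCollection(100000);
const common = firstPage(docs, (d) => d.Nationality === "ITA");
const rare = firstPage(docs, (d) => d.Timestamp === 12345);

console.log("ITA filter examined:", common.examined);
console.log("Timestamp filter examined:", rare.examined);
```

With a low-cardinality filter the page fills after a small fraction of the collection has been examined, while the near-unique Timestamp filter never fills the page and the scan runs to the end of the collection.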
When I run the same queries without paging (e.g. by appending a count or obtaining an execution plan), all the queries perform poorly without indexes.
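One way to compare the queries without the paging effect is to read the execution statistics from the plan. This is a rough sketch, assuming a mongosh-style db handle is passed in and that the plan exposes an executionStats document; scanStats is a hypothetical helper name, not part of MongoDB.

```javascript
// Assumption: `db` is a mongosh-style database handle that exposes the
// Elements collection. scanStats is a hypothetical helper for illustration.
function scanStats(db, filter) {
  // "executionStats" runs the query to completion, so paging plays no role.
  const plan = db.Elements.find(filter).explain("executionStats");
  const stats = plan.executionStats;
  return {
    returned: stats.nReturned,
    docsExamined: stats.totalDocsExamined,
    millis: stats.executionTimeMillis,
  };
}
```

In the shell this could be called as scanStats(db, { Nationality: "ITA" }) and scanStats(db, { Timestamp: NumberLong(12345) }). Without an index, both calls should report a docsExamined value close to the collection size.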
Therefore, there doesn't seem to be any special treatment of Int64 values as opposed to other primitive types.