
Elasticsearch document count returned by _stats versus _count

I'm trying to get statistics/counts on indices in my Elasticsearch cluster (1.2.1). I was using the Indices Stats API (_stats endpoint) to get the total number of primary documents and their size on disk. However, I started experimenting with the Count API (_count endpoint) and noticed that the values do not align.

What is the difference between these values? It's not entirely clear from the documentation, though a clue there indicates that the value returned from Indices Stats can change when the index is refreshed. This makes me wonder if this is a lower-level value from the Lucene layer.

Indices Stats API

localhost:9200/my_index/_stats

...snip...

"_all" : {
  "primaries" : {
    "docs" : {
      "count" : 8284,
      "deleted" : 87
    }
  }
}

...snip...

Count API

localhost:9200/my_index/_count

{
  "count" : 6854,
  "_shards" : {
    "total" : 40,
    "successful" : 40,
    "failed" : 0
  }
}

Actually, the docs.count you get back from the Indices Stats API also includes the nested documents present in the index, so it will always be greater than or equal to the count you get back from the Count API, which only returns the number of top-level documents, i.e. documents that would be returned from a search query.
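To make the arithmetic concrete, here is a minimal sketch in Python that models the counting rule described above: each object stored under a field mapped as nested is counted as one extra hidden document on top of the top-level one. This is not Elasticsearch code; the function names `objects_at` and `lucene_doc_count`, the `nested_paths` parameter, and the sample documents are all made up for illustration.

```python
def objects_at(doc, dotted_path):
    """Return every object found at dotted_path, expanding lists on the way."""
    current = [doc]
    for key in dotted_path.split("."):
        next_level = []
        for item in current:
            value = item.get(key) if isinstance(item, dict) else None
            if isinstance(value, list):
                # An array of objects contributes one entry per object.
                next_level.extend(v for v in value if isinstance(v, dict))
            elif isinstance(value, dict):
                next_level.append(value)
        current = next_level
    return current


def lucene_doc_count(docs, nested_paths):
    """Model of docs.count: top-level documents plus hidden nested ones.

    nested_paths is a hypothetical list of dotted field paths that are
    mapped with type "nested" (an assumption of this sketch).
    """
    top_level = len(docs)
    hidden = sum(len(objects_at(d, p)) for d in docs for p in nested_paths)
    return top_level + hidden


# Hypothetical sample data: "comments" is assumed to be a nested field.
sample_docs = [
    {"title": "a", "comments": [{"text": "x"}, {"text": "y"}]},
    {"title": "b", "comments": []},
    {"title": "c"},
]

count_api_value = len(sample_docs)                            # what _count would see
stats_api_value = lucene_doc_count(sample_docs, ["comments"])  # what _stats would see
```

With this sample, the model gives 3 top-level documents and 5 documents in total. Applied to the figures in the question, the same reasoning would put the number of nested documents at 8284 - 6854 = 1430, assuming both requests saw the same refreshed state of the index.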

So, judging by the numbers you posted, it looks like your index contains documents with fields whose type is nested in the mapping. Does that sound correct?
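One way to check that guess is to fetch the mapping (for example from localhost:9200/my_index/_mapping) and look for fields of type nested. The following Python sketch walks a mapping's "properties" dictionary and lists such fields. The helper name `find_nested_paths` and the sample mapping are hypothetical, and the exact shape of the _mapping response around "properties" depends on the Elasticsearch version, so extracting that dictionary is left to the caller.

```python
def find_nested_paths(properties, prefix=""):
    """Return the dotted paths of all fields whose mapping type is 'nested'."""
    paths = []
    for name, field in properties.items():
        path = prefix + name
        if field.get("type") == "nested":
            paths.append(path)
        # Object and nested fields both keep their sub-fields under "properties".
        sub_properties = field.get("properties")
        if isinstance(sub_properties, dict):
            paths.extend(find_nested_paths(sub_properties, path + "."))
    return paths


# Hypothetical "properties" section of a mapping, for illustration only.
sample_properties = {
    "title": {"type": "string"},
    "author": {"properties": {"name": {"type": "string"}}},
    "comments": {
        "type": "nested",
        "properties": {
            "text": {"type": "string"},
            "replies": {
                "type": "nested",
                "properties": {"text": {"type": "string"}},
            },
        },
    },
}

nested_fields = find_nested_paths(sample_properties)
```

If the returned list is empty, the mapping declares no nested fields and the gap between the two counts would need another explanation.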

