Why does this query cause 'too many clauses'?

I have a query with only a few 'shoulds' and 'filters', but one of the filters has a terms query with ~20,000 terms in it. Our max_terms_count is 200k, but this is complaining about 'clauses'.

Caused by: org.elasticsearch.ElasticsearchException: Elasticsearch exception [type=too_many_clauses, reason=too_many_clauses: maxClauseCount is set to 1024]

I've written queries containing terms queries with far more terms than this. Why is this query causing a 'too many clauses' error? How can I rewrite this query to get the same result without the error?

{
    "query" : {
      "bool" : {
        "filter" : [
          {
            "nested" : {
              "query" : {
                "range" : {
                  "dateField" : {
                    "from" : "2019-12-03T21:34:30.653Z",
                    "to" : "2020-12-02T21:34:30.653Z",
                    "include_lower" : true,
                    "include_upper" : true,
                    "boost" : 1.0
                  }
                }
              },
              "path" : "observed_feeds",
              "ignore_unmapped" : false,
              "score_mode" : "none",
              "boost" : 1.0
            }
          }
        ],
        "should" : [
          {
            "bool" : {
              "filter" : [
                {
                  "terms" : {
                    "ipAddressField" : [
                      "123.123.123.123",
                      "124.124.124.124",
                      ... like 20,000 of these
                    ],
                    "boost" : 1.0
                  }
                }
              ],
              "adjust_pure_negative" : true,
              "boost" : 1.0
            }
          }
        ],
        "adjust_pure_negative" : true,
        "minimum_should_match" : "1",
        "boost" : 1.0
      }
    }
}

Edit: one note: the reason I'm wrapping the terms query in a should -> bool is that there are times when we need to have multiple terms queries OR'd together. This happened to not be one of them.

You are hitting this with the terms query because the should clause sits outside the filter clause and therefore contributes to score calculation. That is why its terms are subject to max_clause_count. If a score is not required for that part, you can rephrase your query as below:

{
  "query": {
    "bool": {
      "filter": [
        {
          "nested": {
            "query": {
              "range": {
                "dateField": {
                  "from": "2019-12-03T21:34:30.653Z",
                  "to": "2020-12-02T21:34:30.653Z",
                  "include_lower": true,
                  "include_upper": true,
                  "boost": 1
                }
              }
            },
            "path": "observed_feeds",
            "ignore_unmapped": false,
            "score_mode": "none",
            "boost": 1
          }
        },
        {
          "bool": {
            "should": [
              {
                "bool": {
                  "filter": [
                    {
                      "terms": {
                        "ipAddressField": [
                          "123.123.123.123",
                          "124.124.124.124",
                          ... like 20,000 of these
                        ],
                        "boost": 1
                      }
                    }
                  ],
                  "adjust_pure_negative": true,
                  "boost": 1
                }
              }
            ]
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  }
}
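
If the query is assembled in application code, the restructuring above could be sketched roughly as follows. This is only an illustration of the answer's layout: the function name `build_query` and its parameters are hypothetical, the explicit `minimum_should_match` is an assumption carried over from the question's original query, and the optional `boost` / `adjust_pure_negative` keys are left out for brevity. Each entry in `term_groups` becomes one terms query, so several of them can be OR'd together inside the filter.

```python
import json
from typing import Dict, List


def build_query(term_groups: Dict[str, List[str]], date_from: str, date_to: str) -> dict:
    """Hypothetical helper: put the OR'd terms queries inside the top-level filter."""
    # One terms query per field; the should list ORs them together.
    should_clauses = [{"terms": {field: values}} for field, values in term_groups.items()]
    return {
        "query": {
            "bool": {
                "filter": [
                    {
                        "nested": {
                            "path": "observed_feeds",
                            "score_mode": "none",
                            "query": {
                                "range": {
                                    "dateField": {
                                        "from": date_from,
                                        "to": date_to,
                                        "include_lower": True,
                                        "include_upper": True,
                                    }
                                }
                            },
                        }
                    },
                    # The should now lives in filter context, so it does not score.
                    # Assumption: keep minimum_should_match explicit, as in the question.
                    {"bool": {"should": should_clauses, "minimum_should_match": 1}},
                ]
            }
        }
    }


query = build_query(
    {"ipAddressField": ["123.123.123.123", "124.124.124.124"]},
    "2019-12-03T21:34:30.653Z",
    "2020-12-02T21:34:30.653Z",
)
print(json.dumps(query, indent=2))
```

The resulting dict has no top-level should any more; the terms queries sit under the second element of the filter list, mirroring the JSON above.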
