Elasticsearch: filter documents by an array, where the request contains all of the document's array elements

Question

The documents I store in Elasticsearch have the following structure:

{
  "id": 1,
  "test": "name",
  "rules": [
    {
      "id": 2,
      "name": "rule1",
      "ruleDetails": [
        {
          "id": 3,
          "requiredAnswerId": 1
        },
        {
          "id": 4,
          "requiredAnswerId": 2
        },
        {
          "id": 5,
          "requiredAnswerId": 3
        }
      ]
    }
  ]
}

Here, the rules property has the nested type.

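I need to query documents by checking that the array of requiredAnswerId values passed in the search request (the provided terms) contains all of the rules.ruleDetails.requiredAnswerId values stored in the document.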

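Does anyone know which Elasticsearch option I can use to build such a specific query? Or would it be better to fetch the whole documents and do the filtering at the application level?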
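
For the application-level alternative mentioned above, a minimal Python sketch could look like the following. It assumes the documents have already been fetched as plain dictionaries shaped like the example; the function name all_required_answers_provided is hypothetical. Note that it checks the IDs of all rules in a document together, whereas a nested query evaluates each rules object separately.

```python
from typing import Iterable


def all_required_answers_provided(document: dict, provided_ids: Iterable[int]) -> bool:
    """Return True if every requiredAnswerId stored in the document
    appears in provided_ids (hypothetical helper, not part of any client)."""
    provided = set(provided_ids)
    # Collect every requiredAnswerId across all rules of the document.
    required = {
        detail["requiredAnswerId"]
        for rule in document.get("rules", [])
        for detail in rule.get("ruleDetails", [])
    }
    # Subset test: the document's IDs must all be among the provided ones.
    return required <= provided
```

A document with no rules has an empty set of required IDs, so it passes the check for any input.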

Update: adding the mapping

{
  "my_index": {
    "mappings": {
      "properties": {
        "id": {
          "type": "long"
        },
        "test": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword"
            }
          }
        },
        "rules": {
          "type": "nested",
          "properties": {
            "id": {
              "type": "long"
            },
            "name": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "ruleDetails": {
              "properties": {
                "id": {
                  "type": "long"
                },
                "requiredAnswerId": {
                  "type": "long"
                }
              }
            }
          }
        }
      }
    }
  }
}

Answer 1

Mapping:

{
  "index4" : {
    "mappings" : {
      "properties" : {
        "id" : {
          "type" : "integer"
        },
        "rules" : {
          "type" : "nested",
          "properties" : {
            "id" : {
              "type" : "integer"
            },
            "name" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword"
                }
              }
            },
            "ruleDetails" : {
              "properties" : {
                "id" : {
                  "type" : "long"
                },
                "requiredAnswerId" : {
                  "type" : "long"
                }
              }
            }
          }
        },
        "test" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword"
            }
          }
        }
      }
    }
  }
}

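Query: this requires a script, which is bad from a performance standpoint. The script runs over every document and checks whether each requiredAnswerId value is present in the passed parameters.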

{
  "query": {
    "nested": {
      "path": "rules",
      "query": {
        "script": {
          "script": {
            "source": "for(a in doc['rules.ruleDetails.requiredAnswerId']){if(!params.Ids.contains((int)a)) return false; }  return true;",
            "params": {
              "Ids": [
                1,
                2,
                3
              ]
            }
          }
        }
      },
      "inner_hits": {}
    }
  }
}

Result:

  "hits" : [
      {
        "_index" : "index4",
        "_type" : "_doc",
        "_id" : "TxOpvnEBf42mOjxvvLQB",
        "_score" : 4.0,
        "_source" : {
          "id" : 1,
          "test" : "name",
          "rules" : [
            {
              "id" : 2,
              "name" : "rule1",
              "ruleDetails" : [
                {
                  "id" : 3,
                  "requiredAnswerId" : 1
                },
                {
                  "id" : 4,
                  "requiredAnswerId" : 2
                },
                {
                  "id" : 5,
                  "requiredAnswerId" : 3
                }
              ]
            },
            {
              "id" : 3,
              "name" : "rule3",
              "ruleDetails" : [
                {
                  "id" : 3,
                  "requiredAnswerId" : 1
                },
                {
                  "id" : 4,
                  "requiredAnswerId" : 2
                }
              ]
            }
          ]
        },
        "inner_hits" : {
          "rules" : {
            "hits" : {
              "total" : {
                "value" : 1,
                "relation" : "eq"
              },
              "max_score" : 4.0,
              "hits" : [
                {
                  "_index" : "index4",
                  "_type" : "_doc",
                  "_id" : "TxOpvnEBf42mOjxvvLQB",
                  "_nested" : {
                    "field" : "rules",
                    "offset" : 0
                  },
                  "_score" : 4.0,
                  "_source" : {
                    "id" : 2,
                    "name" : "rule1",
                    "ruleDetails" : [
                      {
                        "id" : 3,
                        "requiredAnswerId" : 1
                      },
                      {
                        "id" : 4,
                        "requiredAnswerId" : 2
                      },
                      {
                        "id" : 5,
                        "requiredAnswerId" : 3
                      }
                    ]
                  }
                }
              ]
            }
          }
        }
      }
    ]

Edit 1

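terms_set can be used as an alternative. It will be faster than the script query.

It returns documents that contain a minimum number of exact terms in a provided field.

minimum_should_match_script: the size of the array can be used as the minimum number of passed values that must match.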

Query:

{
  "query": {
    "nested": {
      "path": "rules",
      "query": {
        "bool": {
          "filter": {
            "terms_set": {
              "rules.ruleDetails.requiredAnswerId": {
                "terms": [
                  1,
                  2,
                  3
                ],
                "minimum_should_match_script": {
                  "source": "doc['rules.ruleDetails.requiredAnswerId'].size()"
                }
              }
            }
          }
        }
      },
      "inner_hits": {}
    }
  }
}
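
To make the matching rule of this query easier to follow, here is a rough Python emulation of it for a single nested rules object. It is only a sketch of the logic, not how Elasticsearch implements it; the helper names terms_set_matches and rule_matches are hypothetical, and it assumes the requiredAnswerId values within one rule are distinct.

```python
def terms_set_matches(field_values, terms, minimum_should_match):
    """Emulate terms_set: count how many of the provided terms occur in the
    field and compare that count with the required minimum."""
    matching_terms = len(set(terms) & set(field_values))
    return matching_terms >= minimum_should_match


def rule_matches(rule, terms):
    """Apply the check to one rules object, using the number of stored
    requiredAnswerId values as the minimum (like the size() script)."""
    values = [detail["requiredAnswerId"] for detail in rule["ruleDetails"]]
    return terms_set_matches(values, terms, len(values))
```

Because the minimum equals the number of values stored in the rule, the rule matches only when every one of its values is among the provided terms.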

Answer 2

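After playing with ES for a while and reading its documentation, I found that you should keep in mind that the provided script has to be compiled and applied to the documents, so it is the slower option when you already know in advance how many elements are required to match.

So I created a separate field, requiredMatches, that stores the number of rules.ruleDetails.requiredAnswerId elements for each document, and I compute it before indexing the document. Then, instead of using minimum_should_match_script in the search query, I used minimum_should_match_field: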

{
  "query": {
    "nested": {
      "path": "rules",
      "query": {
        "bool": {
          "filter": {
            "terms_set": {
              "rules.ruleDetails.requiredAnswerId": {
                "terms": [
                  1,
                  2,
                  3
                ],
                "minimum_should_match_field": "requiredMatches"
              }
            }
          }
        }
      },
      "inner_hits": {}
    }
  }
}
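
The precomputation step described above might be sketched in Python as follows. The answer does not show where requiredMatches is stored or how it is mapped, so this sketch makes an assumption: it writes the count onto each rules object, in which case the field path used in the query would need to match that location. The function name with_required_matches is hypothetical.

```python
import copy


def with_required_matches(document: dict) -> dict:
    """Return a copy of the document in which every rule carries a
    requiredMatches count, computed before indexing (assumed placement)."""
    enriched = copy.deepcopy(document)  # leave the caller's document untouched
    for rule in enriched.get("rules", []):
        ids = {detail["requiredAnswerId"] for detail in rule.get("ruleDetails", [])}
        # Count distinct IDs; this equals the element count when IDs are unique.
        rule["requiredMatches"] = len(ids)
    return enriched
```

The enriched dictionary is what would then be sent to the index in place of the original document.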

I used the following example as a reference

Answer 1
1 Accepted 2020-04-28 02:53:18

Answer 2
0 2020-07-06 12:05:19
