How to filter documents in ElasticSearch by count in an array?

Question

I have documents that contain an array of keywords, for example:

tags : ['one', 'two', 'three', 'one']

Is there a way to filter documents so that I only get the docs where "one" appears twice, and ignore the docs where it appears only once?

GET /_search
{
  "size": 0,
  "aggs" : {
    "tags" : {
      "filters" : {
        "filters" : {
          "grp1" :  {"bool" : {"must" : [
              {"term" : { "tags" : "one" } } // add condition appear twice in array
            ]
          }},
          "grp2" :  {"bool" : {"must" : [
              {"term" : { "tags" : "two" } },
              {"term" : { "tags" : "three" } }
            ]
          }}
        }
      }
    }
  }
}

Answer 1

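If you index the array as a regular text field (not as keyword), you can use a match_phrase query with a slop high enough that it matches the values at any position in the indexed array.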

    PUT test_index/_doc/1
    {
      "tags" : ["one", "two", "three", "one"]
    }
    
    PUT test_index/_doc/2
    {
      "tags" : ["one", "two", "three"]
    }

    GET test_index/_search
    {
      "query": {
        "match_phrase": {
          "tags":{
            "query": "one one",
            "slop": 1000
          }
        }
      }
    }
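
The same clause can also be placed inside the filters aggregation from the question. This is only a minimal sketch: it assumes tags is an analyzed text field as in the sample documents above, and it keeps the grp1 and grp2 bucket names from the question.

    GET test_index/_search
    {
      "size": 0,
      "aggs": {
        "tags": {
          "filters": {
            "filters": {
              "grp1": {
                "match_phrase": {
                  "tags": { "query": "one one", "slop": 1000 }
                }
              },
              "grp2": {
                "bool": {
                  "must": [
                    { "term": { "tags": "two" } },
                    { "term": { "tags": "three" } }
                  ]
                }
              }
            }
          }
        }
      }
    }

Here grp1 is intended to count only the documents in which "one" occurs at least twice, while grp2 is unchanged from the question.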

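The slop of 1000 is used because each item in the array is indexed about 100 positions after the previous one. So we need a slop high enough to match one "one" at position 0 and the other "one" at position 303; a longer array would need a correspondingly larger slop. Try running the following request to see the position of each item: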

    GET test_index/_analyze
    {
      "analyzer": "standard",
      "text": ["one", "two", "three", "one"]
    }
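
The gap between array items is controlled by the position_increment_gap mapping parameter, which defaults to 100 for text fields. If you would rather not depend on dynamic mapping, an explicit mapping could look roughly like the sketch below. The index name test_index_explicit is hypothetical, and the mapping has to be created before any documents are indexed.

    PUT test_index_explicit
    {
      "mappings": {
        "properties": {
          "tags": {
            "type": "text",
            "position_increment_gap": 100
          }
        }
      }
    }

Whatever slop you choose has to be large enough to span roughly this gap multiplied by the distance between the repeated items.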

Solution 1
0 2021-03-17 09:22:01