Elasticsearch: exclude documents containing specific terms

I have indexed documents like the one below in Elasticsearch.

{    
    "category": "clothing (f)",
    "description": "Women's Unstoppable Graphic T-Shirt - Women’s Short Sleeve Shirt",
    "name": "Women's Unstoppable Graphic T-Shirt",
    "price": "$34.99"
}

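There are categories such as clothing (m), clothing (f), and so on. When the search is for women's items, I am trying to exclude items in the clothing (m) category. The query I am trying is: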

{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "description": "women's black shirt"
          }
        }
      ],
      "must_not": [
        {
          "term": {
            "category": "clothing (m)"
          }
        }
      ]
    }
  },
  "from": 0,
  "size": 50
}

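But this is not working as expected. A few clothing (m) documents always show up in the results alongside the other documents. How can I exclude documents that have a particular category?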

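To exclude a specific term (exact match), you have to use the keyword data type.

Keyword fields are typically used for filtering (for example, finding all blog posts whose status is published), for sorting, and for aggregations. Keyword fields are only searchable by their exact value.

See: Keyword datatype (Elasticsearch reference)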

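Your current query still returns clothing (m) documents because the category values were analyzed with Elasticsearch's standard analyzer when they were indexed. That analyzer splits clothing (m) into the two tokens clothing and m, so the index contains no single term equal to clothing (m). A term query is not analyzed; it looks for that exact string, finds nothing, and must_not therefore excludes nothing.

In your query, you are searching category as a text data type.

Fields of the text data type are analyzed, that is, they are passed through an analyzer that converts the string into a list of individual terms before it is indexed.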

Run this command:

POST my_index/_analyze
{
  "text": ["clothing (m)"]
}

Result:

{
  "tokens" : [
    {
      "token" : "clothing",
      "start_offset" : 0,
      "end_offset" : 8,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "m",
      "start_offset" : 10,
      "end_offset" : 11,
      "type" : "<ALPHANUM>",
      "position" : 1
    }
  ]
}
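
For contrast, the same text can be run through the built-in keyword analyzer, which is roughly what happens to a value stored in a keyword field that has no normalizer. This is a small sketch rather than part of the original answer; it needs no index because the analyzer is named explicitly.

```console
POST _analyze
{
  "analyzer": "keyword",
  "text": ["clothing (m)"]
}
```

The response should contain a single token, clothing (m), with the space and parentheses preserved, which is the exact value a term query needs to find.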

A working example:

Assume your mapping looks like this:

{
 "my_index" : {
    "mappings" : {
      "properties" : {
        "category" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "description" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "name" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "price" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}
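
If full-text search on category is never needed, one option is to map the field as keyword only, so that term queries can target category directly. The sketch below assumes Elasticsearch 7 or later (typeless mappings) and uses a hypothetical index name, my_index_v2. The field types chosen for description, name, and price are assumptions for illustration. The type of an existing field cannot be changed in place, so the documents would have to be reindexed into the new index.

```console
PUT my_index_v2
{
  "mappings": {
    "properties": {
      "category":    { "type": "keyword" },
      "description": { "type": "text" },
      "name":        { "type": "text" },
      "price":       { "type": "keyword" }
    }
  }
}
```

With a mapping like this, the must_not clause would use "category" rather than "category.keyword".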

Let's index a couple of documents:

POST my_index/_doc/1
{
    "category": "clothing (f)",
    "description": "Women's Unstoppable Graphic T-Shirt - Women’s Short Sleeve Shirt",
    "name": "Women's Unstoppable Graphic T-Shirt",
    "price": "$34.99"
}


POST my_index/_doc/2
{
    "category": "clothing (m)",
    "description": "Women's Unstoppable Graphic T-Shirt - Women’s Short Sleeve Shirt",
    "name": "Women's Unstoppable Graphic T-Shirt",
    "price": "$34.99"
}

Now the query should look like this:

GET my_index/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "description": "women's black shirt"
        }
      },
      "filter": {
        "bool": {
          "must_not": {
            "term": {
              "category.keyword": "clothing (m)"
            }
          }
        }
      }
    }
  },
  "from": 0,
  "size": 50
}
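
The nested filter clause is not strictly required. As a minimal sketch, the query from the question should behave the same way once its must_not clause targets category.keyword instead of category; must_not clauses already run in filter context, so they do not affect scoring either way.

```console
GET my_index/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "description": "women's black shirt" } }
      ],
      "must_not": [
        { "term": { "category.keyword": "clothing (m)" } }
      ]
    }
  },
  "from": 0,
  "size": 50
}
```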

Result:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.43301374,
    "hits" : [
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.43301374,
        "_source" : {
          "category" : "clothing (f)",
          "description" : "Women's Unstoppable Graphic T-Shirt - Women’s Short Sleeve Shirt",
          "name" : "Women's Unstoppable Graphic T-Shirt",
          "price" : "$34.99"
        }
      }
    ]
  }
}

Result without using keyword (term on category instead of category.keyword):

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.43301374,
    "hits" : [
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.43301374,
        "_source" : {
          "category" : "clothing (f)",
          "description" : "Women's Unstoppable Graphic T-Shirt - Women’s Short Sleeve Shirt",
          "name" : "Women's Unstoppable Graphic T-Shirt",
          "price" : "$34.99"
        }
      },
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.43301374,
        "_source" : {
          "category" : "clothing (m)",
          "description" : "Women's Unstoppable Graphic T-Shirt - Women’s Short Sleeve Shirt",
          "name" : "Women's Unstoppable Graphic T-Shirt",
          "price" : "$34.99"
        }
      }
    ]
  }
}

As you can see from the last result, clothing (m) is returned as well. By the way, don't use term on fields of the text data type; use match instead.
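
If only the analyzed category field were available, a match query with the and operator is one way to approximate the exclusion. This sketch is an assumption layered on the answer's advice, not something the original shows: the query text is analyzed into clothing and m, and both tokens must be present, so it would also exclude any other category that happens to contain both tokens.

```console
GET my_index/_search
{
  "query": {
    "bool": {
      "must": {
        "match": { "description": "women's black shirt" }
      },
      "must_not": {
        "match": {
          "category": { "query": "clothing (m)", "operator": "and" }
        }
      }
    }
  }
}
```

For an exact exclusion, the term query on category.keyword remains the more precise choice.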

Hope this helps.

