Elasticsearch exclude documents containing specific terms
I have indexed documents like the one below in Elasticsearch.
{
"category": "clothing (f)",
"description": "Women's Unstoppable Graphic T-Shirt - Women’s Short Sleeve Shirt",
"name": "Women's Unstoppable Graphic T-Shirt",
"price": "$34.99"
}
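There are categories such as clothing (m), clothing (f), and so on. If the search is for women's items, I want to exclude items in the clothing (m) category. The query I am trying is: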
{
"query": {
"bool": {
"must": [
{
"match": {
"description": "women's black shirt"
}
}
],
"must_not": [
{
"term": {
"category": "clothing (m)"
}
}
]
}
},
"from": 0,
"size": 50
}
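But this does not work as expected. A few clothing (m) documents always show up in the results alongside the other documents. How can I exclude documents that have a specific category?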
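To exclude a specific term (exact match), you have to use the keyword data type.
Keyword data types are typically used for filtering (for example, finding all blog posts with the status published), sorting, and aggregations. Keyword fields are only searchable by their exact value.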
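Your current query still returns clothing (m) because, when your documents were indexed, the category value was analyzed with the Elasticsearch standard analyzer, which splits clothing (m) into the terms clothing and m. The term query looks for the exact, unanalyzed term clothing (m), which does not exist in the index for that field, so must_not excludes nothing.
In your query, you are searching category as a text data type.
Text data type fields are analyzed: before being indexed, they are passed through an analyzer that converts the string into a list of individual terms.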
Run this command:
POST my_index/_analyze
{
"text": ["clothing (m)"]
}
Result:
{
"tokens" : [
{
"token" : "clothing",
"start_offset" : 0,
"end_offset" : 8,
"type" : "<ALPHANUM>",
"position" : 0
},
{
"token" : "m",
"start_offset" : 10,
"end_offset" : 11,
"type" : "<ALPHANUM>",
"position" : 1
}
]
}
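The effect of analysis on a term lookup can be sketched in a few lines of Python. This is only a rough approximation: the helper `approx_standard_tokens` is hypothetical and simply lowercases and splits on non-word characters, which happens to reproduce the two tokens shown above for this input but is not the real standard analyzer.

```python
import re

def approx_standard_tokens(text):
    """Hypothetical stand-in for the standard analyzer:
    lowercase the text and split on non-word characters."""
    return re.findall(r"\w+", text.lower())

# Roughly what a text field stores in the inverted index for this value.
text_terms = approx_standard_tokens("clothing (m)")

# A keyword field stores the whole value as a single term.
keyword_terms = ["clothing (m)"]

# A term query compares its input verbatim against the stored terms.
query_term = "clothing (m)"
term_hits_text_field = query_term in text_terms
term_hits_keyword_field = query_term in keyword_terms

print(text_terms)               # ['clothing', 'm']
print(term_hits_text_field)     # False
print(term_hits_keyword_field)  # True
```

Because the verbatim lookup fails against the analyzed terms, a must_not clause built on it has nothing to exclude.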
A working example:
Suppose your mapping looks like this:
{
"my_index" : {
"mappings" : {
"properties" : {
"category" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"description" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"name" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"price" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
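Before relying on category.keyword, it may be worth confirming that the sub-field really exists in your index. The sketch below assumes you already have the mapping response as a Python dict in the shape shown above; the function name `keyword_subfields` is made up for illustration, and fetching the mapping from the cluster is left out.

```python
def keyword_subfields(mapping_response, index, field):
    """Return the full names of keyword sub-fields of `field`,
    given a mapping response shaped like the one shown above."""
    properties = (
        mapping_response.get(index, {})
        .get("mappings", {})
        .get("properties", {})
    )
    sub_fields = properties.get(field, {}).get("fields", {})
    return [
        f"{field}.{name}"
        for name, definition in sub_fields.items()
        if definition.get("type") == "keyword"
    ]

# Trimmed copy of the mapping above, keeping only the category field.
mapping = {
    "my_index": {
        "mappings": {
            "properties": {
                "category": {
                    "type": "text",
                    "fields": {
                        "keyword": {"type": "keyword", "ignore_above": 256}
                    },
                }
            }
        }
    }
}

print(keyword_subfields(mapping, "my_index", "category"))
print(keyword_subfields(mapping, "my_index", "missing_field"))
```

An empty list would mean there is no keyword sub-field to run the term query against.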
Let's index a couple of documents:
POST my_index/_doc/1
{
"category": "clothing (f)",
"description": "Women's Unstoppable Graphic T-Shirt - Women’s Short Sleeve Shirt",
"name": "Women's Unstoppable Graphic T-Shirt",
"price": "$34.99"
}
POST my_index/_doc/2
{
"category": "clothing (m)",
"description": "Women's Unstoppable Graphic T-Shirt - Women’s Short Sleeve Shirt",
"name": "Women's Unstoppable Graphic T-Shirt",
"price": "$34.99"
}
Now our query should look like this:
GET my_index/_search
{
"query": {
"bool": {
"must": {
"match": {
"description": "women's black shirt"
}
},
"filter": {
"bool": {
"must_not": {
"term": {
"category.keyword": "clothing (m)"
}
}
}
}
}
},
"from": 0,
"size": 50
}
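If the request body is assembled in application code, the same query can be built as a plain dict. This is a minimal sketch: `build_exclusion_query` is a hypothetical helper, the field names are taken from the example above, and actually sending the body to Elasticsearch is not shown.

```python
import json

def build_exclusion_query(search_text, excluded_category, size=50):
    """Build a search body that matches on description and filters out
    one exact category via the category.keyword sub-field."""
    return {
        "query": {
            "bool": {
                "must": {"match": {"description": search_text}},
                "filter": {
                    "bool": {
                        "must_not": {
                            "term": {"category.keyword": excluded_category}
                        }
                    }
                },
            }
        },
        "from": 0,
        "size": size,
    }

body = build_exclusion_query("women's black shirt", "clothing (m)")
print(json.dumps(body, indent=2))
```

Keeping the excluded category as a parameter makes it straightforward to exclude clothing (f) instead when the search is for men's items.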
Result:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.43301374,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.43301374,
"_source" : {
"category" : "clothing (f)",
"description" : "Women's Unstoppable Graphic T-Shirt - Women’s Short Sleeve Shirt",
"name" : "Women's Unstoppable Graphic T-Shirt",
"price" : "$34.99"
}
}
]
}
}
Result without using keyword (that is, with the term query on category instead of category.keyword):
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 0.43301374,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.43301374,
"_source" : {
"category" : "clothing (f)",
"description" : "Women's Unstoppable Graphic T-Shirt - Women’s Short Sleeve Shirt",
"name" : "Women's Unstoppable Graphic T-Shirt",
"price" : "$34.99"
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.43301374,
"_source" : {
"category" : "clothing (m)",
"description" : "Women's Unstoppable Graphic T-Shirt - Women’s Short Sleeve Shirt",
"name" : "Women's Unstoppable Graphic T-Shirt",
"price" : "$34.99"
}
}
]
}
}
As you can see from the last result, clothing (m) is returned as well. By the way, don't use term on the text data type; use match.
Hope this helps.