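Elasticsearch - How to use multiple analyzers in a query
I want to apply both a synonym filter and a stop-word filter in a query. To do this, I created two analyzers, and each one works fine on its own. But how can I use both at the same time?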
GET my_index/_search/
{
  "query": {
    "match": {
      "_all": {
        "query": "Good and Bad",
        "analyzer": [
          "stop_analyzer",
          "synonym"
        ]
      }
    }
  }
}
The query above throws an error:
{
  "error": {
    "root_cause": [
      {
        "type": "parsing_exception",
        "reason": "[match] unknown token [START_ARRAY] after [analyzer]",
        "line": 6,
        "col": 26
      }
    ],
    "type": "parsing_exception",
    "reason": "[match] unknown token [START_ARRAY] after [analyzer]",
    "line": 6,
    "col": 26
  },
  "status": 400
}
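I guess I can't use an array or an object there, because with a single analyzer (for example "analyzer": "stop_analyzer" or "analyzer": "synonym") it works fine. So my question is: how can I use both at the same time?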
You can define a custom analyzer that combines the two simple analyzers into one by chaining their token filters.
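As a rough sketch of what "combining" means here: the question does not show how `stop_analyzer` and `synonym` are defined, so the two definitions below are hypothetical, as is the helper `combine_analyzers`. The idea is only that two custom analyzers sharing a tokenizer can be merged by concatenating their filter lists in a chosen order.

```python
import json

# Hypothetical definitions of the two analyzers from the question;
# the original post does not show them.
stop_analyzer = {"tokenizer": "standard", "filter": ["lowercase", "english_stop"]}
synonym = {"tokenizer": "standard", "filter": ["lowercase", "my_synonym"]}


def combine_analyzers(first, second):
    """Merge two custom analyzers that share a tokenizer into one filter chain."""
    if first["tokenizer"] != second["tokenizer"]:
        raise ValueError("analyzers with different tokenizers cannot be merged this way")
    filters = []
    for name in first["filter"] + second["filter"]:
        if name not in filters:  # keep the first occurrence, preserve order
            filters.append(name)
    return {"type": "custom", "tokenizer": first["tokenizer"], "filter": filters}


# Synonyms first, then stop words.
combined = combine_analyzers(synonym, stop_analyzer)
print(json.dumps(combined, indent=2))
```

The order of the arguments matters: here synonyms are expanded before stop words are removed, so a synonym rule can still see every original token.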
Suppose you created the index as follows:
PUT my_index
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "stopwordsSynonym": {
            "filter": [
              "lowercase",
              "my_synonym",
              "english_stop"
            ],
            "tokenizer": "standard"
          }
        },
        "filter": {
          "english_stop": {
            "type": "stop",
            "stopwords": "_english_"
          },
          "my_synonym": {
            "type": "synonym",
            "synonyms": [
              "nice => good",
              "poor => bad"
            ]
          }
        }
      }
    }
  },
  "mappings": {
    "my_type": {
      "properties": {
        "my_text": {
          "type": "text",
          "analyzer": "stopwordsSynonym"
        }
      }
    }
  }
}
and added a document:
POST my_index/my_type
{
  "my_text": "People aren’t born good or bad. Maybe they’re born with tendencies either way, but it’s the way you live your life that matters."
}
Now a search on my_text uses the stopwordsSynonym analyzer by default. The following query matches the document, because nice is mapped to its synonym good:
GET my_index/_search
{
  "query": {
    "match": {
      "my_text": "nice and ugly"
    }
  }
}
You can also test the analyzer like this:
GET my_index/_analyze
{
  "analyzer": "stopwordsSynonym",
  "text": "nice or ugly"
}
{
  "tokens": [
    {
      "token": "good",
      "start_offset": 0,
      "end_offset": 4,
      "type": "SYNONYM",
      "position": 0
    },
    {
      "token": "ugly",
      "start_offset": 8,
      "end_offset": 12,
      "type": "<ALPHANUM>",
      "position": 2
    }
  ]
}
Compare this with the output of the standard analyzer:
GET my_index/_analyze
{
  "analyzer": "standard",
  "text": "nice or ugly"
}
{
  "tokens": [
    {
      "token": "nice",
      "start_offset": 0,
      "end_offset": 4,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "or",
      "start_offset": 5,
      "end_offset": 7,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "ugly",
      "start_offset": 8,
      "end_offset": 12,
      "type": "<ALPHANUM>",
      "position": 2
    }
  ]
}
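Indeed, stopwordsSynonym replaces the nice token with good (its type becomes SYNONYM) and removes or from the token list, because it is a common English stop word. The position of ugly stays at 2, so the removed stop word leaves a gap.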
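To make the effect of the filter chain concrete, here is a toy Python simulation of lowercase, then synonym, then stop. It is not how Elasticsearch implements analysis: the regex tokenizer is a crude approximation of the standard tokenizer, `STOP_WORDS` is only a small assumed subset of the English stop-word list, and the function name `analyze` is made up for this sketch.

```python
import re

# Hypothetical stand-ins for the filters configured in the index settings.
SYNONYMS = {"nice": "good", "poor": "bad"}    # "nice => good", "poor => bad"
STOP_WORDS = {"a", "an", "and", "or", "the"}  # small subset of English stop words


def analyze(text):
    """Return (token, position) pairs after lowercase -> synonym -> stop."""
    # Rough approximation of a word tokenizer: runs of letters, digits, apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", text)
    tokens = []
    for position, word in enumerate(words):
        token = word.lower()                # lowercase filter
        token = SYNONYMS.get(token, token)  # synonym filter (explicit mapping)
        if token in STOP_WORDS:             # stop filter drops the token,
            continue                        # leaving a gap in the positions
        tokens.append((token, position))
    return tokens


print(analyze("nice or ugly"))
print(analyze("Good and Bad"))
```

For "nice or ugly" this prints [('good', 0), ('ugly', 2)], which mirrors the tokens and positions in the _analyze response for stopwordsSynonym.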
To use a different analyzer for a given query, you can use a query_string query:
GET /_search
{
  "query": {
    "query_string": {
      "query": "my_text:nice and poor",
      "analyzer": "stopwordsSynonym"
    }
  }
}
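or a match_phrase query (my_standard_text is not part of the mapping above; it presumably stands for a text field whose default analyzer is standard):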
GET my_index/_search
{
  "query": {
    "match_phrase": {
      "my_standard_text": {
        "query": "nice and poor",
        "analyzer": "stopwordsSynonym"
      }
    }
  }
}
In any case, the analyzer should be added to the index settings when the index is created (see the beginning of this answer).
Take a look at the search analyzer, which lets you use a different analyzer at search time than at index time.
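A minimal sketch of what that could look like, assuming the `search_analyzer` mapping parameter is set on the my_text field from the answer above. The helper name `build_field_mapping` and the choice of standard as the index-time analyzer are assumptions for illustration; the fragment would still need to be placed under the mappings of an index whose settings define stopwordsSynonym.

```python
import json


def build_field_mapping(index_analyzer, search_analyzer):
    """Build a properties fragment that analyzes my_text differently at search time."""
    return {
        "my_text": {
            "type": "text",
            "analyzer": index_analyzer,          # applied when documents are indexed
            "search_analyzer": search_analyzer,  # applied to query text at search time
        }
    }


# Assumption: index with the standard analyzer, search with the combined one.
body = build_field_mapping("standard", "stopwordsSynonym")
print(json.dumps(body, indent=2))
```

With a mapping like this, a plain match query on my_text would not need an explicit "analyzer" parameter to get the synonym and stop-word handling on the query side.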