
Elasticsearch - How to use multiple analyzers in a query

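I want to apply both a synonym filter and a stop-word filter in a query. To do that I created two analyzers, and each of them works fine on its own. But I want to use both at the same time. How can I do that?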

GET my_index/_search/
{
    "query": {
        "match": {
           "_all": {
             "query": "Good and Bad",
             "analyzer": [
                 "stop_analyzer",
                 "synonym"
             ]
           }
        }
    }
}

The query above throws an error:

{
   "error": {
      "root_cause": [
         {
            "type": "parsing_exception",
            "reason": "[match] unknown token [START_ARRAY] after [analyzer]",
            "line": 6,
            "col": 26
         }
      ],
      "type": "parsing_exception",
      "reason": "[match] unknown token [START_ARRAY] after [analyzer]",
      "line": 6,
      "col": 26
   },
   "status": 400
}

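I guess I can't use an array or an object there, because with a single analyzer (for example "analyzer": "stop_analyzer" or "analyzer": "synonym") the query works fine. So my question is: how can I use both at once?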

You can define a custom analyzer that combines the two simple analyzers into a single, more complex one.

Defining a custom analyzer

Suppose you created the index like this:

PUT my_index
{  
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "stopwordsSynonym": {
            "filter": [
              "lowercase",
              "my_synonym",
              "english_stop"
            ],
            "tokenizer": "standard"
          }
        },
        "filter": {
          "english_stop": {
            "type": "stop",
            "stopwords": "_english_"
          },
          "my_synonym": {
            "type": "synonym",
            "synonyms": [
              "nice => good",
              "poor => bad"  
            ]
          }
        }
      }
    }
  },
  "mappings": {
    "my_type": {
        "properties": {
            "my_text": {
                "type": "text",
                "analyzer": "stopwordsSynonym"
            }
        }
    }
  }
}

and added a document:

POST my_index/my_type
{
    "my_text": "People aren’t born good or bad. Maybe they’re born with tendencies either way, but it’s the way you live your life that matters."
}

Now a search on my_text uses the stopwordsSynonym analyzer by default. The following query matches the document, because nice is a synonym of good:

GET my_index/_search
{
    "query": {
        "match": {
            "my_text": "nice and ugly"
        }
    }
}

Testing the custom analyzer

You can also test the analyzer like this:

GET my_index/_analyze 
{
  "analyzer": "stopwordsSynonym", 
  "text":     "nice or ugly"
}

{
   "tokens": [
      {
         "token": "good",
         "start_offset": 0,
         "end_offset": 4,
         "type": "SYNONYM",
         "position": 0
      },
      {
         "token": "ugly",
         "start_offset": 8,
         "end_offset": 12,
         "type": "<ALPHANUM>",
         "position": 2
      }
   ]
}

Compare this with the output of the standard analyzer:

GET my_index/_analyze 
{
  "analyzer": "standard", 
  "text":     "nice or ugly"
}

{
   "tokens": [
      {
         "token": "nice",
         "start_offset": 0,
         "end_offset": 4,
         "type": "<ALPHANUM>",
         "position": 0
      },
      {
         "token": "or",
         "start_offset": 5,
         "end_offset": 7,
         "type": "<ALPHANUM>",
         "position": 1
      },
      {
         "token": "ugly",
         "start_offset": 8,
         "end_offset": 12,
         "type": "<ALPHANUM>",
         "position": 2
      }
   ]
}

Indeed, stopwordsSynonym replaced the nice token with good (its type is SYNONYM) and removed or from the token list, because or is a common English stop word.
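To make the filter chain concrete, here is a toy Python sketch of what the three filters do, in order. It is not how Elasticsearch implements analysis: the regex tokenizer only stands in for the standard tokenizer, STOPWORDS is a small illustrative subset rather than the full _english_ list, and analyze is a hypothetical helper name.

```python
import re

# Same rules as the my_synonym filter in the index settings.
SYNONYMS = {"nice": "good", "poor": "bad"}
# Illustrative subset only; the real _english_ list is longer.
STOPWORDS = {"a", "an", "and", "or", "the", "but"}


def analyze(text):
    """Return (token, position) pairs after lowercase -> synonym -> stop."""
    # Crude stand-in for the standard tokenizer: split on word characters.
    tokens = [(m.group(0), pos) for pos, m in enumerate(re.finditer(r"\w+", text))]
    # 1. lowercase filter
    tokens = [(tok.lower(), pos) for tok, pos in tokens]
    # 2. synonym filter: replace a token when a rule exists for it
    tokens = [(SYNONYMS.get(tok, tok), pos) for tok, pos in tokens]
    # 3. stop filter: drop stop words but keep the original positions
    return [(tok, pos) for tok, pos in tokens if tok not in STOPWORDS]


print(analyze("nice or ugly"))  # [('good', 0), ('ugly', 2)]
```

Because lowercasing runs first, Nice and NICE are also mapped to good, and the removed stop word leaves a gap at position 1, just as in the _analyze output above.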

Specifying the analyzer for a query

To use a different analyzer for a given query, you can use a query_string query:

GET /_search
{
    "query": {
        "query_string": {
            "query": "my_text:nice and poor",
            "analyzer": "stopwordsSynonym"
        }
    }
}

Or a match_phrase query:

GET my_index/_search
{
    "query": {
        "match_phrase" : {
            "my_standard_text" : {
                "query" : "nice and poor",
                "analyzer": "stopwordsSynonym"
            }
        }
    }
}

In either case, the analyzer has to be added to the index settings when the index is created (see the beginning of this answer).
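Applied to the match query from the question, the fix is to pass one analyzer name instead of an array. The sketch below builds such a request body in Python; match_query is a hypothetical helper, and the type check only mirrors the parsing error shown in the question, it is not part of any Elasticsearch client.

```python
import json


def match_query(field, text, analyzer):
    """Build a match query body that overrides the analyzer for one query."""
    # The analyzer parameter takes one analyzer name; a list is what
    # triggered the START_ARRAY parsing error in the question.
    if not isinstance(analyzer, str):
        raise TypeError("analyzer must be a single analyzer name, not a list")
    return {"query": {"match": {field: {"query": text, "analyzer": analyzer}}}}


body = match_query("my_text", "Good and Bad", "stopwordsSynonym")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of GET my_index/_search, assuming the index defines stopwordsSynonym as shown earlier.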

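Also take a look at the search analyzer (the search_analyzer mapping parameter), which lets you use a different analyzer at search time than at index time.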
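As a rough illustration of that option, the following Python sketch builds a field mapping that indexes with one analyzer and searches with another. The helper name and the choice of standard as the index-time analyzer are assumptions made for this example; only the analyzer and search_analyzer mapping parameters come from Elasticsearch itself.

```python
import json


def mapping_with_search_analyzer(field, index_analyzer, search_analyzer):
    """Build a properties block with separate index and search analyzers."""
    return {
        "properties": {
            field: {
                "type": "text",
                # Used when documents are indexed.
                "analyzer": index_analyzer,
                # Used when query text against this field is analyzed.
                "search_analyzer": search_analyzer,
            }
        }
    }


body = mapping_with_search_analyzer("my_text", "standard", "stopwordsSynonym")
print(json.dumps(body, indent=2))
```

This block would go under "mappings" in the index creation request (nested under the type name, as in the example above, on versions that still use mapping types). With such a mapping, queries on my_text pick up stopwordsSynonym without an explicit "analyzer" in each query.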

