Elasticsearch does not work with the special character '^' (caret)

Question

The problem is that any character sequence containing the boost operator '^' (caret) returns no search results.

However, according to the Elasticsearch documentation below:

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#_reserved_characters

the characters + - && || ! ( ) { } [ ] ^ " ~ * ? : \ can be escaped with the \ symbol.
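For reference, the escaping rule quoted above can be sketched in a few lines of Python. This is only an illustration: `escape_query_string` and `RESERVED_CHARS` are hypothetical names, not part of any Elasticsearch client, and the character set follows the reserved-character list in the linked `query_string` documentation (which also notes that `<` and `>` cannot be escaped, so they are left out here). The second print shows how the escaped value looks once it is serialized into a JSON request body.

```python
import json

# Characters that the query_string syntax treats as operators.
# Assumption: '<' and '>' are omitted because the docs say they cannot be escaped.
RESERVED_CHARS = set('+-=&|!(){}[]^"~*?:\\/')


def escape_query_string(text: str) -> str:
    """Prefix every reserved character with a backslash."""
    return "".join("\\" + ch if ch in RESERVED_CHARS else ch for ch in text)


escaped = escape_query_string("xyz%^")
print(escaped)              # xyz%\^
print(json.dumps(escaped))  # "xyz%\\^"  (the backslash is doubled inside JSON)
```

Note that this only covers the query syntax side; it does not change which tokens were written to the index.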

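The requirement is to do a "contains" search in Elasticsearch using an n-gram analyzer.

Below is the mapping structure for the sample use case: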

{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "nGram_analyzer": {
            "filter": [
              "lowercase",
              "asciifolding"
            ],
            "type": "custom",
            "tokenizer": "ngram_tokenizer"
          },
          "whitespace_analyzer": {
            "filter": [
              "lowercase",
              "asciifolding"
            ],
            "type": "custom",
            "tokenizer": "whitespace"
          }
        },
        "tokenizer": {
          "ngram_tokenizer": {
            "token_chars": [
              "letter",
              "digit",
              "punctuation",
              "symbol"
            ],
            "min_gram": "2",
            "type": "nGram",
            "max_gram": "20"
          }
        }
      }
    }
  },
  "mappings": {
    "employee": {
      "properties": {
        "employeeName": {
          "type": "string",
          "analyzer": "nGram_analyzer",
          "search_analyzer": "whitespace_analyzer"
        }
      }
    }
  }
}

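We have an employee name like the following, which contains special characters: xyz%^&*

The sample query used for the contains search is shown below: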

GET
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "employeeName": {
              "query": "xyz%^",
              "type": "boolean",
              "operator": "or"
            }
          }
        }
      ]
    }
  }
}

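Even when we try escaping it as "query": "xyz%\\^", it returns an error. So we cannot search for any string that contains '^' (caret).

Any help is greatly appreciated.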

Answer 1

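There is a bug in the ngram tokenizer related to this issue.

Basically, ^ is not considered a Symbol | Letter | Punctuation by the ngram tokenizer. As a result, it splits the input on ^.

Example (URL-encoded xyz%^):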

GET <index_name>/_analyze?tokenizer=ngram_tokenizer&text=xyz%25%5E
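The URL-encoded text in the request above can be reproduced with Python's standard library, which may help when testing other inputs. This is a small sketch; the variable name `encoded` is only illustrative.

```python
from urllib.parse import quote

# safe="" forces every reserved character, including '%', to be percent-encoded.
encoded = quote("xyz%^", safe="")
print(encoded)  # xyz%25%5E
```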

The result of the analyze API call above contains no ^, as shown in the response below:

{
   "tokens": [
      {
         "token": "xy",
         "start_offset": 0,
         "end_offset": 2,
         "type": "word",
         "position": 0
      },
      {
         "token": "xyz",
         "start_offset": 0,
         "end_offset": 3,
         "type": "word",
         "position": 1
      },
      {
         "token": "xyz%",
         "start_offset": 0,
         "end_offset": 4,
         "type": "word",
         "position": 2
      },
      {
         "token": "yz",
         "start_offset": 1,
         "end_offset": 3,
         "type": "word",
         "position": 3
      },
      {
         "token": "yz%",
         "start_offset": 1,
         "end_offset": 4,
         "type": "word",
         "position": 4
      },
      {
         "token": "z%",
         "start_offset": 2,
         "end_offset": 4,
         "type": "word",
         "position": 5
      }
   ]
}

Because '^' is never indexed, there is no match.
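The behaviour described in this answer can be mimicked outside Elasticsearch with a rough Python emulation. It is a sketch under stated assumptions, not the tokenizer's actual implementation: `is_token_char` and `ngram_tokens` are hypothetical helpers, and the "symbol" class is assumed to cover only the Unicode categories Sc, Sm and So. The caret belongs to category Sk (modifier symbol), so under that assumption it acts as a split point.

```python
import unicodedata

# Assumption: the affected tokenizer's "symbol" class excludes "Sk" (modifier symbols).
SYMBOL_CATEGORIES = {"Sc", "Sm", "So"}


def is_token_char(ch: str) -> bool:
    """Approximate token_chars = letter, digit, punctuation, symbol."""
    cat = unicodedata.category(ch)
    return (
        cat.startswith("L")          # letter
        or cat == "Nd"               # digit
        or cat.startswith("P")       # punctuation ('%' is "Po")
        or cat in SYMBOL_CATEGORIES  # symbol, without "Sk"
    )


def ngram_tokens(text: str, min_gram: int = 2, max_gram: int = 20) -> list:
    """Split on non-token characters, then emit n-grams for each run."""
    runs, current = [], ""
    for ch in text:
        if is_token_char(ch):
            current += ch
        else:
            if current:
                runs.append(current)
            current = ""
    if current:
        runs.append(current)

    tokens = []
    for run in runs:
        for start in range(len(run)):
            for length in range(min_gram, max_gram + 1):
                if start + length > len(run):
                    break
                tokens.append(run[start:start + length])
    return tokens


print(unicodedata.category("^"))  # Sk
print(ngram_tokens("xyz%^"))      # ['xy', 'xyz', 'xyz%', 'yz', 'yz%', 'z%']
```

The search side uses the whitespace analyzer, which keeps xyz%^ as a single term. That term is not among the n-grams above, which is consistent with the empty result.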
