
Elasticsearch search not working for values containing the special character '^' (caret symbol)

The problem is that any character sequence containing the boost operator '^' (caret symbol) returns no search results.

However, according to the Elasticsearch documentation below:

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#_reserved_characters

    • The characters && || ( ) { } [ ] ^ " ~ * ? : \ can be escaped with the \ symbol.
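The escaping rule quoted above can be sketched in Python as follows. This is only an illustration: the helper name `escape_query_string` is hypothetical, the character set is limited to the characters listed above, and each `&` and `|` is escaped individually as a simplification. It also assumes the text is destined for `query_string` syntax, which is what the linked documentation section describes.

```python
# Characters from the list quoted above. '&' and '|' are escaped one by one
# (a simplification of the "&&" and "||" operators).
RESERVED = set('&|(){}[]^"~*?:\\')


def escape_query_string(text: str) -> str:
    """Prefix every reserved character with a backslash."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in text)


if __name__ == "__main__":
    # Shows the escaped form of the employee name fragment used in this question.
    print(escape_query_string("xyz%^"))
```

Note that the result is the raw query text; when it is embedded in a JSON request body, each backslash must itself be written as `\\`.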

The requirement is to perform a "contains" search in Elasticsearch using an n-gram analyzer.

Below is the mapping structure for the sample use case:

{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "nGram_analyzer": {
            "filter": [
              "lowercase",
              "asciifolding"
            ],
            "type": "custom",
            "tokenizer": "ngram_tokenizer"
          },
          "whitespace_analyzer": {
            "filter": [
              "lowercase",
              "asciifolding"
            ],
            "type": "custom",
            "tokenizer": "whitespace"
          }
        },
        "tokenizer": {
          "ngram_tokenizer": {
            "token_chars": [
              "letter",
              "digit",
              "punctuation",
              "symbol"
            ],
            "min_gram": "2",
            "type": "nGram",
            "max_gram": "20"
          }
        }
      }
    }
  },
  "mappings": {
    "employee": {
      "properties": {
        "employeeName": {
          "type": "string",
          "analyzer": "nGram_analyzer",
          "search_analyzer": "whitespace_analyzer"
        }
      }
    }
  }
}

There is an employee name like the following, which contains special characters: xyz%^&*

The sample query used for the contains search is shown below:

GET
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "employeeName": {
              "query": "xyz%^",
              "type": "boolean",
              "operator": "or"
            }
          }
        }
      ]
    }
  }
}

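Even when we try to escape it, as in "query": "xyz%\\^", the query fails with an error. So we cannot search for any sequence of characters that contains '^' (caret symbol).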

Any help is greatly appreciated.

This error is related to an issue in the ngram tokenizer.

Basically, ^ is not treated as Symbol | Letter | Punctuation by the ngram tokenizer. As a result, it splits the input at ^.
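The behaviour described above can be imitated with a small Python sketch. It is not the tokenizer's actual implementation. It assumes that the `symbol` class covers only the Unicode categories Sm, Sc and So, which leaves out Sk (modifier symbol), the category that `^` belongs to. The function names are hypothetical.

```python
import unicodedata


def is_token_char(ch: str) -> bool:
    """Approximate token_chars = letter, digit, punctuation, symbol."""
    cat = unicodedata.category(ch)
    return (
        cat.startswith("L")            # letter
        or cat == "Nd"                 # digit
        or cat.startswith("P")         # punctuation ('%' is Po)
        or cat in {"Sm", "Sc", "So"}   # symbol, ASSUMED to exclude Sk ('^')
    )


def split_runs(text: str) -> list:
    """Split the text wherever a character is not a token character."""
    runs, current = [], ""
    for ch in text:
        if is_token_char(ch):
            current += ch
        elif current:
            runs.append(current)
            current = ""
    if current:
        runs.append(current)
    return runs


def ngram_tokens(text: str, min_gram: int = 2, max_gram: int = 20) -> list:
    """Emit n-grams of each run, shortest first for every start offset."""
    tokens = []
    for run in split_runs(text):
        for start in range(len(run)):
            for size in range(min_gram, max_gram + 1):
                if start + size > len(run):
                    break
                tokens.append(run[start:start + size])
    return tokens


if __name__ == "__main__":
    print(unicodedata.category("^"))  # Sk
    print(ngram_tokens("xyz%^"))
```

Under that assumption, `^` acts as a separator, so no emitted token ever contains it.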

Example (with xyz%^ URL-encoded):

GET <index_name>/_analyze?tokenizer=ngram_tokenizer&text=xyz%25%5E
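The `text` parameter in this request is the percent-encoded form of `xyz%^`. As a minimal sketch, the request path can be built in Python with the standard library; the helper name `build_analyze_path` and the index name passed to it are hypothetical.

```python
from urllib.parse import quote


def build_analyze_path(index_name: str, tokenizer: str, text: str) -> str:
    """Build the _analyze request path with percent-encoded parameters."""
    return (
        f"{index_name}/_analyze"
        f"?tokenizer={quote(tokenizer, safe='')}"
        f"&text={quote(text, safe='')}"
    )


if __name__ == "__main__":
    # '%' becomes %25 and '^' becomes %5E
    print(build_analyze_path("my_index", "ngram_tokenizer", "xyz%^"))
```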

The result of the analyze request contains no ^, as the response below shows:

{
   "tokens": [
      {
         "token": "xy",
         "start_offset": 0,
         "end_offset": 2,
         "type": "word",
         "position": 0
      },
      {
         "token": "xyz",
         "start_offset": 0,
         "end_offset": 3,
         "type": "word",
         "position": 1
      },
      {
         "token": "xyz%",
         "start_offset": 0,
         "end_offset": 4,
         "type": "word",
         "position": 2
      },
      {
         "token": "yz",
         "start_offset": 1,
         "end_offset": 3,
         "type": "word",
         "position": 3
      },
      {
         "token": "yz%",
         "start_offset": 1,
         "end_offset": 4,
         "type": "word",
         "position": 4
      },
      {
         "token": "z%",
         "start_offset": 2,
         "end_offset": 4,
         "type": "word",
         "position": 5
      }
   ]
}

Because '^' is never indexed, there is no match.
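To make the mismatch concrete, here is a rough Python sketch. The indexed terms are copied from the analyze response for `xyz%^`. The search side imitates `whitespace_analyzer` with a whitespace split plus lowercasing, leaving out `asciifolding` because it does not change plain ASCII input. The function names are hypothetical.

```python
# Terms produced at index time, taken from the analyze response for "xyz%^".
INDEXED_TERMS = {"xy", "xyz", "xyz%", "yz", "yz%", "z%"}


def search_terms(query: str) -> list:
    """Rough stand-in for whitespace_analyzer: split on whitespace, lowercase."""
    return query.lower().split()


def matches(query: str) -> bool:
    """A match query with operator 'or' needs at least one term to be indexed."""
    return any(term in INDEXED_TERMS for term in search_terms(query))


if __name__ == "__main__":
    print(matches("xyz%^"))  # False: the single search term "xyz%^" was never indexed
    print(matches("xyz%"))   # True: "xyz%" is one of the indexed n-grams
```

The search analyzer keeps `xyz%^` as a single term, while the index only holds n-grams that stop before the caret, so the two can never be equal.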
