
Elasticsearch whitespace analyzer vs. a custom analyzer with the whitespace tokenizer

I have a custom analyzer named "default" in my Elasticsearch index template, with its tokenizer set to "whitespace". Can I use Elasticsearch's built-in whitespace analyzer instead? As far as I can tell, my default analyzer with the whitespace tokenizer and the built-in whitespace analyzer do the same job. In general, which is better to use? Will there be any performance impact?

"analysis": {
   "analyzer": {
        "default": {
          "tokenizer": "whitespace"
        }
    }
}

Both the whitespace tokenizer and the whitespace analyzer are built into Elasticsearch. For example, you can run the built-in analyzer through the _analyze API:

GET /_analyze
{
  "analyzer" : "whitespace",
  "text" : "multi grain bread"
}

The following tokens are generated:

{
  "tokens": [
    {
      "token": "multi",
      "start_offset": 0,
      "end_offset": 5,
      "type": "word",
      "position": 0
    },
    {
      "token": "grain",
      "start_offset": 6,
      "end_offset": 11,
      "type": "word",
      "position": 1
    },
    {
      "token": "bread",
      "start_offset": 12,
      "end_offset": 17,
      "type": "word",
      "position": 2
    }
  ]
}
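To compare against the tokenizer on its own, the _analyze API also accepts a "tokenizer" parameter in place of "analyzer". The following is a sketch of a request body you could send to GET /_analyze. The text is the same sample as above, and it should split into the same three terms, since the built-in whitespace analyzer consists of just this tokenizer.

```json
{
  "tokenizer": "whitespace",
  "text": "multi grain bread"
}
```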

You can use either of them if all you want is to split the text wherever whitespace occurs; they produce the same tokens, so there is no meaningful difference between the two in that case. However, if you need to customize the analysis, for example by adding token filters, you should define a custom analyzer with the whitespace tokenizer and the filters you want. This is because the built-in whitespace analyzer cannot be modified.
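As an illustration of the customizable route, here is a minimal sketch of index settings that keep the whitespace tokenizer for the "default" analyzer and add one token filter. The choice of the built-in "lowercase" filter is only an assumption for the example; substitute whatever filters you actually need.

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "default": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["lowercase"]
        }
      }
    }
  }
}
```

If no filters are needed, declaring the analyzer as "default": { "type": "whitespace" } should give the same behaviour as the tokenizer-only definition in the question.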

