Elastic Search Analyzers and Facets

I am evaluating Elastic Search for a client. I have begun playing with the API and have successfully created an index and added documents to it. The main reason for using Elastic Search is that it provides facets functionality.

I am having trouble understanding Analyzers, Tokenizers and Filters and how they fit in with facets. I want to be able to use keywords, dates, search terms, etc. as my facets.

How would I go about incorporating Analyzers into my search, and how can I use them with facets?

By default, when Elastic Search indexes a string it breaks the string up into tokens. For example, "Fox jump over the wall" is tokenized into the individual words "fox", "jump", "over", "the", "wall" (the default analyzer also lowercases each token).

So what does this do? If you search through your documents using a Lucene query for an exact string, you may not get the string that you want, because Elastic Search matches against the tokenized words instead of the entire string, and your search results will be affected accordingly.

For example, if you search for "Fox jump over the wall" as one exact string, you will not get any results. Searching for "Fox" instead will get you a result.
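To make the effect concrete, here is a minimal sketch in plain Python that imitates what a default analyzer does. It is an approximation for illustration only (the function `simple_analyze` is hypothetical and not part of Elastic Search): it splits on non-word characters and lowercases each token.

```python
import re

def simple_analyze(text):
    """Rough stand-in for a default analyzer: split into words and lowercase them."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]

tokens = simple_analyze("Fox jump over the wall")
print(tokens)

# The whole sentence is not one of the indexed tokens, so an exact lookup misses.
print("Fox jump over the wall" in tokens)

# A single (lowercased) word is one of the tokens, so it matches.
print("fox" in tokens)
```

A facet computed over such a field would likewise count the individual tokens rather than the original sentence.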

Setting `"index": "not_analyzed"` on a field in the mapping tells Elastic Search not to tokenize the indexed string, so that you can properly search for exact strings. This is particularly useful when you want facets over entire strings. (The Analyze API is a separate tool: it shows you how a given string would be tokenized.)
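As a sketch of what this could look like, the request bodies below are written as Python dictionaries. They assume the legacy `string` field type with the `"index": "not_analyzed"` mapping option and the facets API of the Elastic Search versions that still shipped facets; the type name `article` and the fields `title`, `tags` and `published` are hypothetical.

```python
import json

# Hypothetical index mapping: "title" is analyzed (tokenized) for full-text search,
# while "tags" is kept as one untokenized value so facets count whole strings.
mapping_body = {
    "mappings": {
        "article": {  # hypothetical document type
            "properties": {
                "title": {"type": "string"},
                "tags": {"type": "string", "index": "not_analyzed"},
                "published": {"type": "date"},
            }
        }
    }
}

# Hypothetical search body: match every document and request a terms facet on "tags".
facet_body = {
    "query": {"match_all": {}},
    "facets": {
        "tags": {"terms": {"field": "tags", "size": 10}}
    },
}

print(json.dumps(mapping_body, indent=2))
print(json.dumps(facet_body, indent=2))
```

The first body would be sent when creating the index and the second to the `_search` endpoint. In later releases facets were replaced by aggregations and untokenized strings by the `keyword` field type, so the exact syntax depends on the version in use.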

Tokenizers just split strings into individual words (tokens), which are then stored in Elastic Search. As mentioned, these tokens can be queried using the Search API.

Filters create a subset of your queried result under specific conditions which you specify, thus helping you separate what you need from what you do not need in your search results.
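A minimal sketch of such a filter, again as a Python dictionary and again assuming an older Elastic Search release (one that supports the `filtered` query); the field names `title` and `tags` and the value `animals` are hypothetical:

```python
import json

# Hypothetical search body: full-text match on "title", narrowed by an exact
# term filter on the untokenized "tags" field.
filtered_body = {
    "query": {
        "filtered": {
            "query": {"match": {"title": "fox"}},
            "filter": {"term": {"tags": "animals"}},
        }
    }
}

print(json.dumps(filtered_body, indent=2))
```

Only documents that satisfy the filter condition remain in the result, while the query part still decides how the remaining documents are scored.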
