
Proximity search highlight in Elasticsearch

Original 2017-12-25 22:56:11 4 1 elasticsearch

When doing a proximity search using match_phrase with 'slop', the highlight tags are applied to each word separately, so I cannot tell where the match as a whole is.

For example, if I search for the phrase "quick fox" in the text "a quick brown fox" with slop=1, I get a result with 'em' tags on "quick" and "fox", like this: <em>quick</em> brown <em>fox</em>

What I need is the entire "quick brown fox" emphasized (connected from the first to the last matched word, i.e. the sequence of words that satisfied the query).
Finding it manually can be complicated when the match_phrase query contains many words and matches many times in the text. Is there any way to configure Elasticsearch to return this?
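
For reference, a request of the kind described above might look roughly like the sketch below. The field name "body" and the helper build_request are made up for illustration; only the match_phrase / slop / highlight structure comes from the Elasticsearch query DSL.

```python
import json


def build_request(field, phrase, slop):
    """Build a match_phrase search body with highlighting on the same field.

    `field` is a hypothetical field name; replace it with the real mapping.
    """
    return {
        "query": {
            "match_phrase": {
                field: {"query": phrase, "slop": slop}
            }
        },
        # An empty object asks for the default highlighter settings,
        # which wrap each matching token in <em> tags.
        "highlight": {"fields": {field: {}}},
    }


if __name__ == "__main__":
    print(json.dumps(build_request("body", "quick fox", 1), indent=2))
```

The printed JSON can be sent as the body of a _search request.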

1 Solution

Elasticsearch tokenizes both your document and your search (basically splitting on whitespace for the example you give). The match_phrase query makes sure those search tokens are found in that order in your document tokens. The highlighter then highlights each of those tokens. I think it would be very hard to do what you want solely with Elasticsearch.
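
If post-processing on the client side is acceptable, one possible workaround is to redo a simplified version of the phrase match yourself and wrap the whole span. The sketch below is only an approximation, not what Elasticsearch does internally: it assumes whitespace tokenization, case-insensitive exact token equality, and in-order matches only (real slop also allows some reordering), and it collapses runs of whitespace. The function names are made up for this example.

```python
def _match_end(lowered, terms, start, slop):
    """Return the index of the last token of an in-order match beginning
    at `start`, or None if there is no match within `slop` extra tokens."""
    if lowered[start] != terms[0]:
        return None
    pos = start
    for term in terms[1:]:
        try:
            # Take the nearest later occurrence; this keeps the span minimal.
            pos = lowered.index(term, pos + 1)
        except ValueError:
            return None
    # Number of non-query tokens sitting inside the matched span.
    gaps = (pos - start) - (len(terms) - 1)
    return pos if gaps <= slop else None


def highlight_phrase_spans(text, phrase, slop=0, pre="<em>", post="</em>"):
    """Wrap each whole matched span (first to last query word) in one tag pair."""
    tokens = text.split()
    terms = [t.lower() for t in phrase.split()]
    if not terms:
        return text
    lowered = [t.lower() for t in tokens]
    out = []
    i = 0
    while i < len(tokens):
        end = _match_end(lowered, terms, i, slop)
        if end is None:
            out.append(tokens[i])
            i += 1
        else:
            out.append(pre + " ".join(tokens[i:end + 1]) + post)
            i = end + 1  # continue after the span so spans never overlap
    return " ".join(out)


if __name__ == "__main__":
    print(highlight_phrase_spans("a quick brown fox", "quick fox", slop=1))
    # prints: a <em>quick brown fox</em>
```

Anything the real analyzer does (stemming, stop words, punctuation handling) would have to be mirrored here for the spans to line up with what Elasticsearch actually matched.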