
What is the regular expression for "starts with" using the regexp filter on Elasticsearch?

I'm working on a search engine using Elasticsearch through its Java API, and I would like to configure a regexp filter for my queries, specifically a "starts with" filter.

Suppose I have these titles in my index:

  1. the world
  2. things about him
  3. george's ultimatum
  4. jumping
  5. jimmy and the flock

If I want only the results whose title starts with the letter t or th, what regular expression should I use?

The correct results after the search should be:

  1. the world
  2. things about him

I've tried using:

^t.*   OR   ^[t.*]

But these don't return any results. The starting anchor ^ doesn't work on Elasticsearch even though the documentation says so. I've also tried:

t.*   OR   [t.*]

But this works just like the prefix filter and includes the result "jimmy and the flock".

Note:

  • I cannot use the regexp query (a limitation of the search engine I'm building), so I'm forced to use only a filter.
  • I've tried the prefix filter, but it evaluates individual terms: the prefix parameter "t", for example, will include the title "jimmy and the flock" because of the term "the".
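
The term-level behaviour described in the second bullet can be reproduced without Elasticsearch. The following is a minimal sketch, not the real analyzer: `TermPrefixDemo`, `analyze`, and `prefixFilter` are hypothetical names, and the analysis step is assumed to be nothing more than lowercasing and splitting on whitespace.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class TermPrefixDemo {
    // Assumption: a rough stand-in for an analyzed field, where each title
    // is lowercased and split into separate terms on whitespace.
    static List<String> analyze(String title) {
        return Arrays.asList(title.toLowerCase(Locale.ROOT).split("\\s+"));
    }

    // A title is a hit as soon as ANY of its terms starts with the prefix.
    static List<String> prefixFilter(List<String> titles, String prefix) {
        List<String> hits = new ArrayList<>();
        for (String title : titles) {
            for (String term : analyze(title)) {
                if (term.startsWith(prefix)) {
                    hits.add(title);
                    break; // one matching term is enough
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList(
            "the world", "things about him", "george's ultimatum",
            "jumping", "jimmy and the flock");
        // prints [the world, things about him, jimmy and the flock]
        System.out.println(prefixFilter(titles, "t"));
    }
}
```

Because the match is tested per term rather than per title, "jimmy and the flock" is returned through its term "the".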

BTW, I'm using ES version 1.0.0.

There is a page on the Elasticsearch blog that addresses exactly this problem: http://www.elasticsearch.org/blog/starts-with-phrase-matching/. As pickypg suggests, it is a mapping problem: you must define a custom analyzer that combines the "keyword" tokenizer with the "lowercase" filter.
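
To see why this mapping change helps, here is a small plain-Java sketch of the idea, not Elasticsearch code. It assumes that the "keyword" tokenizer plus the "lowercase" filter turn each title into a single lowercase term, and it uses `java.util.regex` with `matches()` to approximate a regexp that has to match the whole term. `KeywordStartsWithDemo`, `analyzeAsKeyword`, and `regexpFilter` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

public class KeywordStartsWithDemo {
    // Assumption: keyword tokenizer + lowercase filter means the whole
    // title is kept as ONE term, only lowercased.
    static String analyzeAsKeyword(String title) {
        return title.toLowerCase(Locale.ROOT);
    }

    // Keep the titles whose single term matches the pattern in full.
    static List<String> regexpFilter(List<String> titles, String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String> hits = new ArrayList<>();
        for (String title : titles) {
            // matches() succeeds only if the entire term matches,
            // so no ^ anchor is needed in the pattern.
            if (pattern.matcher(analyzeAsKeyword(title)).matches()) {
                hits.add(title);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList(
            "the world", "things about him", "george's ultimatum",
            "jumping", "jimmy and the flock");
        // prints [the world, things about him]
        System.out.println(regexpFilter(titles, "t.*"));
    }
}
```

With one term per title, the pattern `t.*` only matches titles that begin with "t", so "jimmy and the flock" no longer slips in through its inner word "the".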
