
How to filter multiple characters in Elasticsearch

Is there a way to filter out repeated characters during analysis in Elasticsearch? We would like to set it up so that if a user searches for 'botled', they get the documents that include 'bottled' or 'botttled', etc., i.e. regardless of doubled or tripled letters.

I have been looking for a solution among the token filters https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-tokenfilters.html , but it seems that none of them matches our requirements.

By default, an Elasticsearch text field is analyzed with the standard analyzer, which splits the text on word boundaries and lowercases it, i.e. only the individual words are indexed and searchable.
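Since the question asks about doing this at analysis time, one option worth trying is a pattern_replace character filter that collapses runs of the same character before tokenization. The following is only a sketch, not a tested production setup: the names collapse_repeats, dedupe_letters and the field name are hypothetical, and the helper merely simulates the filter locally with Python's re module (Elasticsearch itself uses Java regex, where the replacement is written "$1").

```python
import json
import re

# Hypothetical index body: "collapse_repeats", "dedupe_letters" and the
# field "name" are illustrative names, not taken from the original thread.
INDEX_BODY = {
    "settings": {
        "analysis": {
            "char_filter": {
                "collapse_repeats": {
                    "type": "pattern_replace",
                    "pattern": r"(.)\1+",   # a character followed by repeats of itself
                    "replacement": "$1",    # keep a single copy (Java regex syntax)
                }
            },
            "analyzer": {
                "dedupe_letters": {
                    "type": "custom",
                    "char_filter": ["collapse_repeats"],
                    "tokenizer": "standard",
                    "filter": ["lowercase"],
                }
            },
        }
    },
    "mappings": {
        "properties": {
            "name": {"type": "text", "analyzer": "dedupe_letters"}
        }
    },
}


def simulate_collapse(text: str) -> str:
    """Local approximation of the char filter, using Python's \\1 backreference."""
    pattern = INDEX_BODY["settings"]["analysis"]["char_filter"]["collapse_repeats"]["pattern"]
    return re.sub(pattern, r"\1", text)


if __name__ == "__main__":
    # Body that could be sent when creating the index (e.g. PUT /my-index)
    print(json.dumps(INDEX_BODY, indent=2))
    for word in ("botled", "bottled", "botttled"):
        print(word, "->", simulate_collapse(word))
```

Because the same analyzer is applied at index time and at search time, 'botled', 'bottled' and 'botttled' should all reduce to the same term. The pattern as written also collapses repeated digits and is case-sensitive, so it may need narrowing for real data.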


Would a regex search work for you?
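 GET /_search
 {
   "query": {
     "regexp": {
       "user": {
         "value": "b+o+t+l+e+d+"
       }
     }
   }
 }

Here b+ matches one or more occurrences of b, so every letter in the pattern may be repeated any number of times. Each letter should appear only once in the pattern (t+ rather than t+t+); otherwise a document containing 'botled' with a single t would not match.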
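The pattern does not have to be written by hand. As a rough sketch, the search term can be turned into such a regexp by collapsing repeated characters and appending + to each remaining one. The function names below are hypothetical, and the list of reserved characters is an assumption that should be checked against the regexp syntax of your Elasticsearch version.

```python
import re

# Characters treated as reserved in Elasticsearch (Lucene) regexp syntax,
# including the optional operators. Assumption: verify for your version.
LUCENE_RESERVED = set('.?+*|{}[]()"\\#@&<>~')


def repeat_tolerant_pattern(term: str) -> str:
    """Build a regexp in which every distinct run of a character may repeat."""
    # Lowercase to match terms produced by the default analyzer,
    # then collapse runs such as "ttt" to a single "t".
    collapsed = re.sub(r"(.)\1+", r"\1", term.lower())
    parts = []
    for ch in collapsed:
        # Escape reserved characters so they are matched literally
        literal = "\\" + ch if ch in LUCENE_RESERVED else ch
        parts.append(literal + "+")
    return "".join(parts)


def regexp_query(field: str, term: str) -> dict:
    """Wrap the pattern in a regexp query body for the given field."""
    return {"query": {"regexp": {field: {"value": repeat_tolerant_pattern(term)}}}}


if __name__ == "__main__":
    print(repeat_tolerant_pattern("botled"))   # b+o+t+l+e+d+
    print(regexp_query("user", "Botttled"))
```

Note that regexp queries are evaluated against every candidate term and can be slow on large indices, so normalizing the text at analysis time may scale better.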
