
Apache Solr wildcard searching with multiple words

We are using Apache Solr with PHP.

There is a problem with wildcard searching.

We want to search for "project manage*" so that it lists possible results such as "project manager", "project management", etc. However, it does not work whenever the wildcard search contains two words.

For example, "projectmanage*" works, whereas "project manage*" does not. We also tried escaping the space, but that does not work either.

Looking forward to all valuable input. Thanks in advance.

When a wildcard is applied, the regular analysis chain is not performed at query time. This results in Solr looking for tokens starting with "project manage", and if you have an analysis chain when indexing, your text is usually split into multiple tokens, so no single indexed token begins with that string.
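To make the mismatch concrete, here is a minimal sketch in Python. It is a toy model, not Solr's actual code: it assumes the index-time analyzer simply lowercases and splits on whitespace, and the sample documents are made up for illustration.

```python
def whitespace_tokens(text):
    """Toy stand-in for an index-time analyzer: lowercase, split on whitespace."""
    return text.lower().split()


def prefix_matches(indexed_tokens, prefix):
    """A wildcard term such as 'abc*' is compared against single indexed tokens."""
    return sorted(t for t in indexed_tokens if t.startswith(prefix))


# Hypothetical sample documents.
docs = ["Project Manager", "Project Management", "projectmanagement office"]
index = {tok for doc in docs for tok in whitespace_tokens(doc)}

# No indexed token contains a space, so this prefix cannot match anything.
print(prefix_matches(index, "project manage"))
# The version without a space matches the one token that was indexed that way.
print(prefix_matches(index, "projectmanage"))
# A single-word prefix matches the individual tokens.
print(prefix_matches(index, "manage"))
```

Under these assumptions, the two-word prefix fails not because of the space as such, but because the index holds "project" and "manager" as separate tokens.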

You can use a Shingle filter to index multiple tokens as a single token, which can be used to get around the issue (be sure to use the same separator as you use in your text).
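The idea behind shingling can be sketched as follows. This is a toy Python model rather than Solr's implementation: it assumes two-word shingles joined by a single space and indexed alongside the single-word tokens, and the sample documents are hypothetical. In Solr itself the equivalent behavior would be configured on the field type's index analyzer.

```python
def shingles(tokens, size=2, separator=" "):
    """Join every run of `size` adjacent tokens into one combined token."""
    return [separator.join(tokens[i:i + size])
            for i in range(len(tokens) - size + 1)]


def index_with_shingles(text):
    """Toy index-time analysis: single-word tokens plus two-word shingles."""
    tokens = text.lower().split()
    return set(tokens) | set(shingles(tokens))


def prefix_matches(indexed_tokens, prefix):
    """Return the indexed tokens that start with the given prefix."""
    return sorted(t for t in indexed_tokens if t.startswith(prefix))


# Hypothetical sample documents.
docs = ["Senior Project Manager", "Project Management Basics"]
index = set()
for doc in docs:
    index |= index_with_shingles(doc)

# The two-word prefix now finds single tokens that contain a space.
print(prefix_matches(index, "project manage"))
```

The separator matters: if the shingles were joined with "_" while the query contained a space, the prefix would again match nothing. The query must also reach the index as one term, which is why the space in the wildcard query still needs escaping.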

Another option is to lowercase the field when indexing and querying and use a regular StrField, which isn't processed in any way, or use a KeywordTokenizer, which keeps the indexed content as a single token.
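A rough sketch of the single-token approach, again as a toy Python model. The field name `title_exact` and the helper names are hypothetical, and the query builder escapes only spaces; a real client would also need to escape the other query-syntax characters.

```python
def keyword_lowercase(value):
    """Toy model of a single-token field: the whole value, lowercased."""
    return value.lower()


def build_prefix_query(field, prefix):
    """Build a prefix query string with spaces backslash-escaped.

    Only spaces are escaped here; other special characters are left as-is.
    """
    escaped = prefix.lower().replace(" ", "\\ ")
    return f"{field}:{escaped}*"


# Hypothetical sample values, each indexed as exactly one token.
values = ["Project Manager", "Project Management", "Senior Project Manager"]
index = {keyword_lowercase(v) for v in values}

prefix = "Project Manage"
print(sorted(t for t in index if t.startswith(prefix.lower())))
print(build_prefix_query("title_exact", prefix))
```

One consequence of this design is that the prefix is anchored at the start of the whole field value: "Senior Project Manager" is not found by "project manage*", whereas the shingle approach would find it.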
