
How can I implement Solr case insensitive and accent insensitive substring search with whitespaces?

I store 120000 wine records in a SQL Server database. Until now I've searched successfully for wine names by performing the following SQL:

WHERE (LOWER(Wine.name) LIKE '%" + (searchString) + "%'")

I am now in the process of switching over to Solr. I would like to search for "clos rene" and get "Clos Réné" back. However, Solr is returning all records that match 'Clos' and all records that match 'Réné'. I have tried the following field definition:

<fieldType name="c_text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Could someone please help me define the correct field type so that I can reproduce my SQL query above and get case-insensitive, accent-insensitive results for multiple words separated by whitespace?

I have also experimented with wildcard searches using field type 'string', but I can't get them to work case-insensitively.

Try:

<fieldType name="c_text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="50" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
</fieldType>

EDIT: OK, now I understand your question. I added an extra filter to the index analyzer: <filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="50" side="front"/>. Try this.
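
To see what this analyzer chain does to a value like "Clos Réné", here is a rough Python sketch of the index and query sides. It is only an approximation, not Solr's actual code: the function names (`fold`, `analyze_index`, `analyze_query`, `matches`) are made up for illustration, accent stripping through `unicodedata` stands in for `ASCIIFoldingFilterFactory` (which folds more characters than this), and the sketch assumes the whole query string reaches the query analyzer as a single token.

```python
import unicodedata

# Values taken from the EdgeNGramFilterFactory attributes in the answer.
MIN_GRAM = 3
MAX_GRAM = 50


def fold(text):
    """Lowercase, then strip accents (rough stand-in for the lowercase and ASCII folding filters)."""
    lowered = text.lower()
    decomposed = unicodedata.normalize("NFKD", lowered)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def analyze_index(value):
    """Index side: keep the whole value as one token, fold it, emit front edge n-grams."""
    token = fold(value)
    longest = min(len(token), MAX_GRAM)
    return {token[:n] for n in range(MIN_GRAM, longest + 1)}


def analyze_query(value):
    """Query side: keep the whole value as one token and fold it (no n-grams)."""
    return fold(value)


def matches(stored_value, query):
    """A document matches when the folded query equals one of its indexed grams."""
    return analyze_query(query) in analyze_index(stored_value)


if __name__ == "__main__":
    print(sorted(analyze_index("Clos Réné"), key=len))
    print(matches("Clos Réné", "clos rene"))  # True: the folded query is an indexed prefix
    print(matches("Clos Réné", "rene"))       # False: edge n-grams only cover prefixes
```

Since the edge n-grams all start at the beginning of the value, this setup behaves like `LIKE 'text%'` rather than `LIKE '%text%'`. If matches in the middle of a name are needed, `solr.NGramFilterFactory` on the index side is one option to look at, at the cost of a larger index.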


 