
Apache Solr: automatically searching with a wildcard (*)

Good evening,

When I search for the word "app", the results do not include "apple". If I search for "app*", both "apple" and "app" are returned. I don't want to type "*" in the search bar. How can I get both "apple" and "app" when I search for just "app"?

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

I tried adding <filter class="solr.ReversedWildcardFilterFactory"/>, but it didn't work.

Can someone help me?

I use Apache Solr 6.4.1.

Sorry for my bad English.

Use EdgeNGramFilterFactory.

EdgeNGramFilterFactory:

This filter generates edge n-gram tokens (prefixes taken from the start of each token) with sizes in the given range.

Arguments:

  • minGramSize: (integer, default 1) The minimum gram size.
  • maxGramSize: (integer, default 1) The maximum gram size.

Example:

With minGramSize = 1 and maxGramSize = 4:

In: "four score"
Tokenizer to Filter: "four", "score"
Out: "f", "fo", "fou", "four", "s", "sc", "sco", "scor"
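
The example above can be reproduced with a short Python sketch. This is not Solr's implementation: the names `edge_ngrams` and `analyze` are made up for illustration, and splitting on whitespace is only a rough stand-in for StandardTokenizerFactory.

```python
def edge_ngrams(token, min_gram=1, max_gram=1):
    """Return the prefixes of `token` whose lengths run from min_gram to max_gram."""
    # A token shorter than max_gram simply yields fewer grams.
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def analyze(text, min_gram, max_gram):
    """Tokenize on whitespace (an assumption) and expand every token into edge n-grams."""
    grams = []
    for token in text.split():
        grams.extend(edge_ngrams(token, min_gram, max_gram))
    return grams


if __name__ == "__main__":
    print(analyze("four score", 1, 4))
```

Note that "score" has five characters, so with maxGramSize = 4 the full word itself is not emitted. This is one reason to choose a maxGramSize at least as long as the longest term you expect.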

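For your case, you can use the schema below. The n-gram filter is applied only in the index analyzer, so the query "app" is left as typed and matches the "app" prefix stored for "apple":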

<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.LowerCaseFilterFactory" />
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="200"/>
     </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
      <filter class="solr.LowerCaseFilterFactory" />
    </analyzer>
</fieldType>

Then change the type of your field to text_ngram, for example:

<field name="name" type="text_ngram" indexed="true" stored="false" multiValued="true"/>

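Note: Don't forget to reload the core and reindex your data. The n-grams are generated at index time, so documents indexed before the change will not match by prefix until they are reindexed.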
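
To see why the n-gram filter belongs only in the index analyzer, here is a toy Python model of the matching. It is a simplified sketch, not how Solr stores or scores documents: the document IDs, the helper names (`index_terms`, `query_terms`, `search`) and the rule that every query term must be present are all assumptions for illustration.

```python
def index_terms(text, min_gram=1, max_gram=200):
    """Index-side analysis: lowercase, split on whitespace, expand into edge n-grams."""
    terms = set()
    for token in text.lower().split():
        for n in range(min_gram, min(max_gram, len(token)) + 1):
            terms.add(token[:n])
    return terms


def query_terms(text):
    """Query-side analysis: lowercase and split only, with no n-gram expansion."""
    return set(text.lower().split())


# Hypothetical documents, each holding one value of the n-gram field.
docs = {"doc1": "Apple", "doc2": "app", "doc3": "banana"}
indexed = {doc_id: index_terms(text) for doc_id, text in docs.items()}


def search(query):
    """Return the IDs of documents whose indexed terms contain every query term."""
    wanted = query_terms(query)
    return sorted(doc_id for doc_id, terms in indexed.items() if wanted <= terms)


if __name__ == "__main__":
    print(search("app"))    # doc1 and doc2: "app" is a stored prefix of "apple"
    print(search("apple"))  # doc1 only
    print(search("ppl"))    # no match: edge n-grams cover prefixes, not infixes
```

If the query side were also expanded into n-grams, "app" would additionally produce "a" and "ap", which in this model would widen the match to far more documents than intended.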
