Apache Solr search against different combinations of text entered
I am new to Apache Solr. I want to search against different combinations of the text entered.
For example, if the text is 'hello', it should return records containing hello, llo, hel, ollhe, he, and so on. Is this possible with Solr?
If so, how can we do this?
Please help me.
This is possible in Solr. You can use EdgeNGramFilterFactory in your fieldType. Here is an example.
With this configuration, the word hello is indexed as the tokens he, hel, hell and hello.
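<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <!-- Index time: keep the whole value as one token, then emit its prefixes of 2 to 15 characters. -->
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15"/>
  </analyzer>
  <!-- Query time: keep the query as one token, without n-gram expansion, so it is matched against the indexed prefixes. -->
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>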
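To make the effect of the index analyzer easier to see, here is a small Python sketch that mimics what an edge n-gram filter emits for a single keyword token. It only illustrates the idea and is not Solr's implementation; the function name `edge_ngrams` is hypothetical, and the default sizes mirror minGramSize="2" and maxGramSize="15" from the fieldType.

```python
def edge_ngrams(token, min_size=2, max_size=15):
    """Return the prefixes of `token` whose length is between min_size and max_size."""
    longest = min(len(token), max_size)
    return [token[:n] for n in range(min_size, longest + 1)]


# Index side: the whole field value is one token, then it is expanded into prefixes.
indexed = edge_ngrams("hello")
print(indexed)  # ['he', 'hel', 'hell', 'hello']

# Query side: the query stays a single token, so it matches only if it equals an indexed prefix.
for query in ("hel", "llo"):
    print(query, query in indexed)
```

In this sketch "hel" is found but "llo" is not, because edge n-grams only cover the beginning of the value.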
Or you can try NGramTokenizerFactory as the tokenizer instead of using EdgeNGramFilterFactory:
<tokenizer class="solr.NGramTokenizerFactory" minGramSize="2" maxGramSize="10"/>
For hello, this generates tokens like he, hel, hell, hello, el, ell, and so on.
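The same idea for full n-grams can be sketched in Python, assuming the sizes minGramSize="2" and maxGramSize="10" from the tokenizer line above. The function name `ngrams` is hypothetical, and the order of the tokens may differ from what Solr emits. Since n-grams are contiguous substrings, a reordering such as ollhe from the question is not among them.

```python
def ngrams(text, min_size=2, max_size=10):
    """Return every contiguous substring of `text` with length min_size..max_size."""
    grams = []
    for start in range(len(text)):
        for size in range(min_size, max_size + 1):
            end = start + size
            if end > len(text):
                break  # longer grams from this start would run past the end
            grams.append(text[start:end])
    return grams


tokens = ngrams("hello")
print(tokens)
# ['he', 'hel', 'hell', 'hello', 'el', 'ell', 'ello', 'll', 'llo', 'lo']
print("llo" in tokens, "ollhe" in tokens)  # True False
```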
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<schema name="autoSolrSchema" version="1.5">
  <types>
    <fieldType class="org.apache.solr.schema.StrField" name="StrField"/>
    <fieldType name="StrTokenizer" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.NGramTokenizerFactory" minGramSize="2" maxGramSize="5"/>
      </analyzer>
    </fieldType>
    <fieldType class="org.apache.solr.schema.TrieFloatField" name="TrieFloatField"/>
    <fieldType class="org.apache.solr.schema.TextField" name="TextField">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
    <fieldType name="user_id" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
      </analyzer>
    </fieldType>
  </types>
  <fields>
    <field indexed="true" multiValued="false" name="user_id" stored="true" type="StrField"/>
    <field indexed="true" multiValued="false" name="company" stored="true" type="StrField"/>
    <field indexed="true" multiValued="false" name="tins" stored="true" type="TrieFloatField"/>
    <field indexed="true" multiValued="false" name="user_standard" stored="true" type="StrTokenizer"/>
    <field indexed="true" multiValued="false" name="requests" stored="true" type="TrieFloatField"/>
    <field indexed="true" multiValued="false" name="include" stored="true" type="TextField"/>
  </fields>
  <uniqueKey>(user_id,company,user_standard)</uniqueKey>
</schema>