Apache Solr filter query containing "-" doesn't work
I have a problem with Apache Solr.
My results contain a field named url. It returns values like these:
http://domain.com/re-RU/someLink
http://domain.com/de-DE/someLink
http://domain.com/en-EN/someLink
http://domain.com/cl-EN/someLink
http://domain.com/ka-EN/someLink
When I add a filter query parameter to my query:
http://ip:port/solr/example/select?q=someSentence&fq=url:ru-RU&wt=json&indent=true
It works well, but only for the de-DE and ru-RU languages.
When I try to filter with en-EN, the results also contain cl-EN and ka-EN.
Where is the problem? How can I resolve this?
You need to check your schema.xml: your url value is probably being split on "-", so en-EN is indexed as the separate tokens en and EN. For example, if you are using StandardTokenizerFactory as your tokenizer class, en-EN is split into en and EN, and de-DE into de and DE.
The same applies at query time. Check which tokenizer your query analyzer uses, because with StandardTokenizerFactory the value in fq=url:en-EN is also split into the tokens en and EN. The filter can then match any URL that yields an en token (for instance once the tokens are lowercased), which would explain why cl-EN and ka-EN come back as well.
For more about tokenizers, see: https://cwiki.apache.org/confluence/display/solr/Tokenizers
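To make the effect concrete, here is a rough Python simulation of the behavior described above. It is not Solr code: `tokenize`, `matches` and `filter_urls` are hypothetical helpers, and the sketch assumes that the field is split on every non-alphanumeric character, that tokens are lowercased, and that a filter matches when any of its tokens appears in the document. The real analysis chain depends on your schema.

```python
import re

# URLs taken from the question.
URLS = [
    "http://domain.com/re-RU/someLink",
    "http://domain.com/de-DE/someLink",
    "http://domain.com/en-EN/someLink",
    "http://domain.com/cl-EN/someLink",
    "http://domain.com/ka-EN/someLink",
]

def tokenize(text):
    """Split on non-alphanumeric characters and lowercase each token.

    Assumption: this only approximates a hyphen-splitting tokenizer
    followed by a lowercase filter.
    """
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", text) if t]

def matches(query, url):
    """Simplified matching: true if any query token is a token of the URL."""
    return bool(set(tokenize(query)) & set(tokenize(url)))

def filter_urls(query):
    """Return the URLs that the simplified filter would let through."""
    return [u for u in URLS if matches(query, u)]

if __name__ == "__main__":
    for q in ("de-DE", "en-EN"):
        print(q, "->", filter_urls(q))
```

Under these assumptions, de-DE only matches its own URL, while en-EN also pulls in the cl-EN and ka-EN URLs, because all three contain an en token.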
Create a field type named urlFilter in your schema.xml, as below.
<fieldType name="urlFilter" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Split on whitespace only, so "-" does not break the URL apart -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <!-- preserveOriginal="1" keeps the unsplit token alongside its parts -->
    <filter class="solr.WordDelimiterFilterFactory" generateNumberParts="1" stemEnglishPossessive="1" generateWordParts="1" preserveOriginal="1" catenateWords="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
Then use the field type above as the type of your url field in schema.xml, as below:
<field name="url" type="urlFilter" indexed="true" stored="true"/>
Then query like this:
http://ip:port/solr/example/select?q=someSentence&fq=url:*ru-RU*&wt=json&indent=true
This should work once you have reindexed your documents with the new field type. Let me know if that helps you :)
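As a rough illustration of why the wildcard form narrows the results, the Python sketch below treats each URL as a single lowercased token and matches the pattern against it with `fnmatch`. This is only a simulation under those assumptions, not Solr's actual wildcard implementation, and `wildcard_filter` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# URLs taken from the question.
URLS = [
    "http://domain.com/re-RU/someLink",
    "http://domain.com/de-DE/someLink",
    "http://domain.com/en-EN/someLink",
    "http://domain.com/cl-EN/someLink",
    "http://domain.com/ka-EN/someLink",
]

def wildcard_filter(pattern):
    """Match a wildcard pattern against each whole URL.

    Assumption: the URL is kept as one token and both the token and
    the pattern are lowercased before comparison.
    """
    lowered = pattern.lower()
    return [u for u in URLS if fnmatchcase(u.lower(), lowered)]

if __name__ == "__main__":
    print(wildcard_filter("*en-EN*"))
```

Because the hyphenated segment is compared as a whole, `*en-EN*` no longer matches the cl-EN or ka-EN URLs in this simulation.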