Apache Solr filter query containing "-" doesn't work

I have a problem with Apache Solr.

My results include a field named url, which returns values like these:

http://domain.com/ru-RU/someLink
http://domain.com/de-DE/someLink
http://domain.com/en-EN/someLink
http://domain.com/cl-EN/someLink
http://domain.com/ka-EN/someLink

When I add a filter query parameter to my query:

http://ip:port/solr/example/select?q=someSentence&fq=url:ru-RU&wt=json&indent=true

it works well, but only for the de-DE and ru-RU languages.

When I try to filter with en-EN, the results also contain cl-EN and ka-EN.

Where is the problem, and how can I resolve it?

Check your schema.xml: the url value is probably being split on "-", so that en-EN is indexed as the two separate tokens en and EN. For example, if you use StandardTokenizerFactory as your tokenizer class, en-EN is broken into en and EN, and de-DE into de and DE. The same applies at query time: if the query analyzer also uses StandardTokenizerFactory, the value in fq=url:en-EN is split into the tokens en and EN as well. That would explain your results, because the token EN also occurs in cl-EN and ka-EN. For more about tokenizers, see: https://cwiki.apache.org/confluence/display/solr/Tokenizers
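To make the effect concrete, here is a minimal Python sketch of the idea. It is only a rough stand-in for a tokenizer that splits on punctuation, not Solr's actual StandardTokenizerFactory, and it assumes a document matches as soon as it shares any one token with the filter value. The function names are made up for illustration.

```python
import re

# Sample values taken from the question.
URLS = [
    "http://domain.com/de-DE/someLink",
    "http://domain.com/en-EN/someLink",
    "http://domain.com/cl-EN/someLink",
    "http://domain.com/ka-EN/someLink",
]


def tokenize(text):
    """Rough approximation: split on anything that is not a letter or digit."""
    return [t for t in re.split(r"[^A-Za-z0-9]+", text) if t]


def filter_urls(filter_value, urls=URLS):
    """Keep URLs that share at least one token with the filter value (assumed OR matching)."""
    wanted = set(tokenize(filter_value))
    return [u for u in urls if wanted & set(tokenize(u))]


if __name__ == "__main__":
    print(tokenize("en-EN"))
    print(filter_urls("de-DE"))
    print(filter_urls("en-EN"))
```

In this model de-DE matches only its own URL, while en-EN also picks up cl-EN and ka-EN through the shared token EN, which is the behaviour described in the question.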

Create a field type named urlFilter, with its own analyzer, in your schema.xml as below.

<fieldType name="urlFilter" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <!-- Split on whitespace only, so a URL stays a single token at this stage -->
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.TrimFilterFactory"/>
      <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <!-- preserveOriginal="1" keeps the unsplit token alongside its parts -->
      <filter class="solr.WordDelimiterFilterFactory" generateNumberParts="1" stemEnglishPossessive="1" generateWordParts="1" preserveOriginal="1" catenateWords="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>

Then use that field type as the type of your url field in schema.xml, as below:

<field name="url" type="urlFilter" indexed="true" stored="true"/>
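As a hedged illustration of why this field type helps, the following Python sketch models the chain very loosely: split on whitespace, keep each original token (as preserveOriginal="1" does), add its alphanumeric parts, and lowercase everything. It ignores the common-grams, catenation and case-change behaviour of the real filters, and it assumes a wildcard filter such as *en-EN* is lowercased and compared against each indexed token. The function names are hypothetical.

```python
import re
from fnmatch import fnmatchcase


def analyze(text):
    """Loose model of the analyzer: original tokens plus their parts, all lowercased."""
    tokens = set()
    for word in text.split():
        tokens.add(word.lower())  # the preserved original token
        for part in re.split(r"[^A-Za-z0-9]+", word):
            if part:
                tokens.add(part.lower())  # the generated word parts
    return tokens


def wildcard_matches(pattern, text):
    """True if any modelled token matches the lowercased wildcard pattern."""
    pattern = pattern.lower()
    return any(fnmatchcase(token, pattern) for token in analyze(text))


if __name__ == "__main__":
    for url in ("http://domain.com/en-EN/someLink",
                "http://domain.com/cl-EN/someLink"):
        print(url, wildcard_matches("*en-EN*", url))
```

Because the whole URL survives as one token, the pattern has to find the literal sequence en-en inside it, so cl-EN and ka-EN no longer qualify in this model.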

Reindex your documents so the new analysis takes effect, and then query like this:

http://ip:port/solr/example/select?q=someSentence&fq=url:*ru-RU*&wt=json&indent=true
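If you build this request from code, it is safer to let a library encode the parameters, since the filter value contains ":" and "*". Below is a small sketch using Python's standard urllib.parse.urlencode. The helper name build_select_url is made up for this example, and the host and port in the usage line are assumptions (substitute your own ip:port).

```python
from urllib.parse import urlencode


def build_select_url(base_url, core, query, locale):
    """Build a Solr select URL with a wildcard filter on the url field."""
    params = {
        "q": query,
        "fq": f"url:*{locale}*",  # wildcard filter, as in the query above
        "wt": "json",
        "indent": "true",
    }
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Host and port are placeholders, not taken from the question.
    print(build_select_url("http://localhost:8983", "example", "someSentence", "ru-RU"))
```

The encoded form (url%3A%2Aru-RU%2A) is equivalent to the hand-written fq=url:*ru-RU* once the server decodes it.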

This should work. Let me know if it helps :)
