
Solr synonym graph filter not working after other filter

I'm trying to convert searches for 15.6" into 15.6 inch. The idea was to first replace 15.6" with 15.6 " and then match the " with the synonym rule " => inch. I created this field type definition:

<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory" />
        <filter class="solr.PatternReplaceFilterFactory" pattern='^([0-9]+([,.][0-9]+)?)(")$' replacement="$1 $3" />
        <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt" />
    </analyzer>
</fieldType>

But it's not working. If I input 15.6" I get 15.6 ", but when I input 15.6 " I get what I want: 15.6 inch.

Why doesn't it work? Am I missing something?

EDIT:

Solr Analysis, not working:

Solr Analysis, working:

The issue is that 15.6 " is still a single token after your pattern replace filter - just creating a token with a space in it will not split it.

You can see that it's still kept as a single token as there is no | on the line (which separates the tokens).

Add a Word Delimiter Filter after it (your analysis chain suggests you already have one; it's just not included in your question), or, better, do the replacement in a PatternReplaceCharFilterFactory, which runs before the tokenizer splits the input into separate tokens:

<analyzer>
  <charFilter class="solr.PatternReplaceCharFilterFactory" pattern='^([0-9]+([,.][0-9]+)?)(")$' replacement="$1 $3" />
  <tokenizer ...>

You might have to massage the pattern a bit (i.e. drop the ^ and $, which aren't respected by Solr anyway, iirc) depending on your input, since it will now be applied to the whole input string rather than to a single token. Make sure that "Macbook 15.6" 256GB" is matched appropriately.
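
As a rough sketch of what that could look like, here is the field type from the question with the replacement moved into a char filter and the anchors dropped. The unanchored pattern and the single-quoted replacement attribute are my assumptions and have not been tested against your data, so check the result on the Analysis page.

```xml
<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
        <!-- Runs on the raw input before tokenizing: 15.6" becomes 15.6 " -->
        <!-- Assumption: no ^/$ anchors, so a measurement in the middle of the input also matches -->
        <charFilter class="solr.PatternReplaceCharFilterFactory" pattern='([0-9]+([,.][0-9]+)?)"' replacement='$1 "' />
        <!-- The whitespace tokenizer now sees the space and emits 15.6 and " as separate tokens -->
        <tokenizer class="solr.WhitespaceTokenizerFactory" />
        <!-- synonyms.txt is assumed to still contain the rule " => inch -->
        <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt" />
    </analyzer>
</fieldType>
```

With this chain, Macbook 15.6" 256GB should come out as Macbook 15.6 inch 256GB, assuming the synonym rule fires the same way it does in the question. Note that an unanchored pattern will also touch a closing quote that directly follows a number (for example in "iphone 5"), so tighten it if such inputs occur.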
