
Apache Solr: Correct use of CompoundWordFilter

Original 2011-08-27 18:06:55 8 2 solr

I'm trying to figure out how best to configure Solr for my app. I'm indexing (mostly German) PDF documents, and I'm using dismax queries to query Solr.

If a document contains the word "Firmenprofil" (a German compound word meaning "company profile"), it is only returned by queries for exactly that word. However, queries containing only "Profil" should also return this document.

I downloaded a German dictionary file and applied a DictionaryCompoundWordTokenFilter to both the index analyzer and the query analyzer.

The problem is that the filter decomposes the query into very small parts (e.g. "pro" in the case of "Firmenprofil"), so all sorts of documents containing words like "Product" are returned.

I tried removing the filter from the query analyzer, which leads to Solr not finding the document at all. I also tried leaving the filter in the query analyzer and explicitly setting the onlyLongestMatch option to true, but that didn't seem to have any effect.
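For reference, the setup described above might look roughly like the following schema.xml fragment. This is only a sketch: the field type name `text_de_compound`, the dictionary file name `german-dictionary.txt`, the tokenizer choice, and the size values are assumptions for illustration, not the actual configuration from this question.

```xml
<!-- Hypothetical field type; the name and dictionary file are placeholders. -->
<fieldType name="text_de_compound" class="solr.TextField" positionIncrementGap="100">
  <!-- A single analyzer element applies to both indexing and querying. -->
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Lowercase first; this assumes the dictionary entries are lowercase. -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Adds dictionary subwords of each compound alongside the original token. -->
    <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
            dictionary="german-dictionary.txt"
            minWordSize="5"
            minSubwordSize="4"
            maxSubwordSize="15"
            onlyLongestMatch="true"/>
  </analyzer>
</fieldType>
```

The `minSubwordSize` attribute sets the shortest subword the filter will emit, so a value of 4 should keep a three-letter dictionary entry such as "pro" from being added as a token. Whether that value suits a given corpus is untested here; the Analysis page in the Solr admin UI shows which tokens are actually produced for "Firmenprofil".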

2 answers

OK, it seems my dictionary file was simply too big (~20 MB). I replaced it with a more compact one, and now it works just fine.

Without your actual config files, it's a bit of a guessing game.

Did you check whether "profil" is part of the dictionary?




solution1
1 2011-09-01 07:29:24

solution2
0 ACCEPTED 2011-08-28 15:19:51
