Lucene case sensitive & insensitive search

Question

I have a Lucene index which is currently case sensitive. I want to add the option of having a case insensitive search as a fall-back. This means that results that match the case will get more weight and will appear first. For example, if the number of results is limited to 10, and there are 10 matches which match my case, this is enough. If I only found 7 results, I can add 3 more results from the case-insensitive search.

My case is actually more complex, since I have items with different weights. Ideally, having a match with "wrong" case will add some weight. Needless to say, I do not want duplicate results.
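To make the requirement concrete, here is a minimal plain-Java sketch of the merge step, assuming you run one case-sensitive and one case-insensitive search and get back (document id, score) pairs. The names `CaseFallbackMerge`, `Hit`, and `wrongCaseFactor` are hypothetical and not part of Lucene; the factor is just one possible way to give a wrong-case match less weight.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not a Lucene class: merges two result lists.
final class CaseFallbackMerge {

    /** One search hit: a document id and the score the search gave it. */
    static final class Hit {
        final String docId;
        final double score;
        Hit(String docId, double score) { this.docId = docId; this.score = score; }
    }

    /**
     * Combines exact-case hits with case-insensitive hits.
     * Case-insensitive scores are scaled by wrongCaseFactor (assumed to be in (0, 1]),
     * each document keeps only its best score, and at most `limit` hits are returned.
     */
    static List<Hit> merge(List<Hit> exactCase, List<Hit> ignoreCase,
                           double wrongCaseFactor, int limit) {
        // LinkedHashMap keeps insertion order, so exact-case hits come first on ties.
        Map<String, Double> best = new LinkedHashMap<>();
        for (Hit h : exactCase) {
            best.merge(h.docId, h.score, Math::max);
        }
        for (Hit h : ignoreCase) {
            // A document already found with the right case keeps its higher score.
            best.merge(h.docId, h.score * wrongCaseFactor, Math::max);
        }

        List<Hit> merged = new ArrayList<>();
        for (Map.Entry<String, Double> e : best.entrySet()) {
            merged.add(new Hit(e.getKey(), e.getValue()));
        }
        // List.sort is stable, so equal scores stay in insertion order.
        merged.sort((x, y) -> Double.compare(y.score, x.score));
        return new ArrayList<>(merged.subList(0, Math.min(limit, merged.size())));
    }
}
```

Because every exact-case match normally also shows up in the case-insensitive results, keying on the document id removes the duplicates, and a heavily weighted item with the wrong case can still outrank a lightly weighted exact match.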

One possible approach is to have two indexes, one case sensitive and one case insensitive, and search both. Naturally, there's some redundancy here, since I need to index everything twice.

Is there a better solution? Ideas?

Answer 1

Have you already tried copyField? See http://wiki.apache.org/solr/SchemaXml#Copy_Fields

If not, define a new field B with a different configuration and copy field A into B via copyField.
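As a rough sketch of what that could look like in a Solr schema.xml, assuming you are using Solr at all: the field names `title` and `title_ci` and the type names `text_cs` and `text_ci` are made up for illustration, so adapt them to your own schema.

```xml
<!-- Hypothetical type and field names; only the factory classes are Solr's own. -->

<!-- Case-sensitive type: tokens are kept exactly as written. -->
<fieldType name="text_cs" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

<!-- Case-insensitive type: same tokenizer, plus lowercasing. -->
<fieldType name="text_ci" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="title"    type="text_cs" indexed="true" stored="true"/>
<field name="title_ci" type="text_ci" indexed="true" stored="false"/>

<!-- Field A (title) is copied into field B (title_ci) at index time. -->
<copyField source="title" dest="title_ci"/>
```

With both fields in the same document, a query such as `title:Lucene^2 title_ci:Lucene` should rank exact-case matches higher while still returning wrong-case matches, and each document appears only once in the results.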

Answer 2

Lucene search itself is case sensitive; it's just that all input is usually lower-cased when it passes through QueryParser, so it feels like it's case insensitive. In other words, for a case-sensitive search, don't lower-case your input before indexing and don't lower-case your queries (i.e. pick an Analyzer that doesn't lower-case, KeywordAnalyzer for example).
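The point can be illustrated without Lucene at all. The following toy inverted index in plain Java is only an analogy, not Lucene's API: `ToyIndex` and its `normalizer` are hypothetical stand-ins for an index and its Analyzer. Term lookup is always an exact string match, and whether the search behaves as case sensitive depends only on how terms are normalized on the way in.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.UnaryOperator;

// Toy stand-in for an index; the normalizer plays the role of the Analyzer.
final class ToyIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final UnaryOperator<String> normalizer;

    ToyIndex(UnaryOperator<String> normalizer) {
        this.normalizer = normalizer;
    }

    /** Splits text on whitespace and indexes each normalized token. */
    void add(int docId, String text) {
        for (String token : text.split("\\s+")) {
            if (token.isEmpty()) continue;
            postings.computeIfAbsent(normalizer.apply(token), k -> new TreeSet<>()).add(docId);
        }
    }

    /** Normalizes the query term the same way, then does an exact lookup. */
    Set<Integer> search(String term) {
        return postings.getOrDefault(normalizer.apply(term), Collections.emptySet());
    }

    public static void main(String[] args) {
        ToyIndex caseSensitive = new ToyIndex(UnaryOperator.identity());
        ToyIndex caseInsensitive = new ToyIndex(s -> s.toLowerCase(Locale.ROOT));
        for (ToyIndex index : new ToyIndex[] {caseSensitive, caseInsensitive}) {
            index.add(1, "Apache Lucene");
            index.add(2, "apache lucene");
        }
        System.out.println(caseSensitive.search("Lucene"));   // only document 1
        System.out.println(caseInsensitive.search("LUCENE")); // documents 1 and 2
    }
}
```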

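See also QueryParser.setLowercaseExpandedTerms(boolean lowercaseExpandedTerms), which controls whether the terms of wildcard, prefix, fuzzy, and range queries are automatically lower-cased.

You can index the terms using a case-sensitive analyzer and, when you want a case-insensitive query, use a query class that does not convert your terms to lowercase.

Look at Wildcard, Prefix, and Fuzzy queries.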

2 answers

solution1
6 ACCEPTED 2010-03-21 16:17:25

solution2
5 2010-03-22 06:54:22
