Lucene filter case sensitive
I'm migrating from Lucene 3.0.1 to 4.1.0. After a few days of analysis, I suspect there is a difference in how query results are filtered between these versions. After the migration I see different results for the same queries and filters.
The situation is as follows:
I was using Lucene 3.0.1, but the StandardAnalyzer for the IndexWriter, for example, was configured this way:
new StandardAnalyzer(Version.LUCENE_24)
The same configuration was used for QueryParser. There are a few Fields that are NOT_ANALYZED (meaning not indexed; deprecated in 4.x), and this causes the problem after migrating to 4.0.0 or 4.1.0. The problem is that the values of some NOT_ANALYZED Fields are UPPER CASE. The search process looks as follows:
I have found this answer regarding case sensitivity. I know that LowerCaseFilter is used in Lucene 2.4. What I did was rebuild the index with 4.x, but with all NOT_ANALYZED values now lower-case. Then the problem disappeared.
What could be the reason that, for my solution using 3.0.3, case sensitivity "does not matter", while in 4.x "it matters"? Maybe some of you could explain to me what is happening under the hood.
Indexing and analyzing are two different things. Analyzing means the field is put through the Analyzer of choice. Fields that are not analyzed are put in the index just the way they are.
If you index an uppercase string without analyzing it, it will stay uppercase in the index and will not be found using a lowercase query.
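The difference between analyzed and not-analyzed fields can be sketched with a toy inverted index. This is only a model using plain Java collections, not Lucene's API: the class ToyIndex, its methods, and the field names "status" and "title" are all hypothetical, and the analyze step (split on whitespace, then lowercase) is an assumed stand-in for an analyzer chain that includes a lowercase filter.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Toy inverted index. NOT Lucene's API, only a model of analyzed vs. not-analyzed fields. */
class ToyIndex {
    // field name -> set of terms stored in the index for that field
    private final Map<String, Set<String>> terms = new HashMap<>();

    /** Assumed stand-in for an analyzer with a lowercase step: lowercase, then split on whitespace. */
    static String[] analyze(String text) {
        return text.toLowerCase(Locale.ROOT).trim().split("\\s+");
    }

    /** Index a value. When analyzed is false, the whole value is stored as one verbatim term. */
    void add(String field, String value, boolean analyzed) {
        Set<String> fieldTerms = terms.computeIfAbsent(field, k -> new HashSet<>());
        if (analyzed) {
            for (String term : analyze(value)) {
                fieldTerms.add(term);
            }
        } else {
            fieldTerms.add(value); // stored exactly as given, case included
        }
    }

    /** Exact term lookup: the term is compared as-is, with no analysis applied to it. */
    boolean hasTerm(String field, String term) {
        return terms.getOrDefault(field, Collections.<String>emptySet()).contains(term);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("status", "ACTIVE", false);          // not analyzed: stays uppercase
        index.add("title", "Lucene In Action", true);  // analyzed: lowercased tokens

        System.out.println(index.hasTerm("status", "ACTIVE")); // true
        System.out.println(index.hasTerm("status", "active")); // false
        System.out.println(index.hasTerm("title", "lucene"));  // true
        System.out.println(index.hasTerm("title", "Lucene"));  // false
    }
}
```

In this model a lowercase lookup against the not-analyzed field only matches if the value was lowercased before indexing, which is consistent with the rebuilt-index workaround described in the question.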