
Search in Lucene index

I created a Lucene (3.0.1) index on a column that I want to run text searches against. While testing with this text:

$GLD is a great example of why it does not make sense EVER to try and catch a falling knife.

I get a result if I search for the keyword "falling", but I get nothing when I search for "$GLD".

I am using StandardAnalyzer:

String longString = "$GLD is a great example of why it does not make sense EVER to try and catch a falling knife.";

Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_30);
doc.add(new Field("data", longString, Store.YES, Field.Index.ANALYZED));

Because Field.Index.ANALYZED is set, it should create tokens, and $GLD should be among them. The analyzer removes stop words from the text; is the word $GLD also being removed in the process?

The analyzer changes your document's field at index time. Why not run the same analyzer on your query before you search? A QueryParser would help a lot here.

You should check what StandardAnalyzer does: it may remove the $ and may apply a LowerCaseFilter as part of its processing (I'm not sure; I only know versions 2.3 and 4.1). The LowerCaseFilter converts the words to lower case, so when you search with upper-case letters you won't get anything.
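To make the reasoning above concrete, here is a minimal sketch in plain Java that imitates the kind of processing described: split on non-alphanumeric characters, lower-case each piece, and drop stop words. It is not Lucene code. The class AnalyzerSketch, the method analyze, and the short stop-word list are hypothetical stand-ins, and the real StandardAnalyzer uses a grammar-based tokenizer that handles more cases than this. It only illustrates why a term stored as "gld" would not match a raw search for "$GLD".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for an analysis chain; NOT Lucene's StandardAnalyzer.
final class AnalyzerSketch {

    // Illustrative subset of English stop words (an assumption, not Lucene's full list).
    private static final Set<String> STOP_WORDS = new HashSet<>(
            Arrays.asList("a", "an", "and", "is", "it", "not", "of", "the", "to"));

    // Splits on anything that is not a letter or digit, lower-cases, drops stop words.
    public static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split("[^A-Za-z0-9]+")) {
            if (piece.isEmpty()) {
                continue; // a leading separator such as '$' yields an empty first piece
            }
            String term = piece.toLowerCase(Locale.ROOT);
            if (!STOP_WORDS.contains(term)) {
                tokens.add(term);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "$GLD is a great example of why it does not make sense "
                + "EVER to try and catch a falling knife.";
        List<String> indexed = analyze(text);

        // The raw query string is not among the stored terms.
        System.out.println(indexed.contains("$GLD"));              // false
        // Passing the query through the same analysis yields a stored term.
        System.out.println(indexed.containsAll(analyze("$GLD")));  // true
    }
}
```

The design point is the one made in the answers: the query has to go through the same analysis as the indexed text, which is what passing the same analyzer to a QueryParser does in Lucene.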

You can use Luke to inspect the tokenized result in the index.

Using Luke to check whether your query does what you expect also helps.
