Lucene title/content searching
I'm storing my Lucene docs like so:
Document doc = new Document();
doc.add(new TextField("contents", "Homer January, Lenny February"));
doc.add(new TextField("title", "2017 on call schedule.xls", Field.Store.YES));
Document doc = new Document();
doc.add(new TextField("contents", "Carl January, Frank February"));
doc.add(new TextField("title", "2018 on call schedule.xls", Field.Store.YES));
I get a hit if I search for the exact title, or for a term like
2017
but no hits if I try things like
call
on call
xls
I've tried simple things like
Query query1 = new QueryParser("title", analyzer).parse("on call");
and more complicated ideas like
Builder bb = new BooleanQuery.Builder();
for(String chunk : "on call".split(" ")){
bb.add(new TermQuery(new Term("title", chunk)), BooleanClause.Occur.SHOULD);
}
BooleanQuery booleanQuery = bb.build();
Maybe I'm storing my docs wrong?
I'm using the StandardAnalyzer on search & insert.
Seems like I'm missing something quite fundamental here. Does anyone have any tips, please?
I think it's always a good idea to visualize your terms before running your search. I looked at the index with the Luke tool: it shows that there is no term schedule, only schedule.xls.
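If Luke is not at hand, the indexed terms can also be inspected in code by running the title through the same analyzer. The following is a minimal sketch, assuming Lucene is on the classpath; the class name TokenDump and the helper tokens are made up for illustration and are not part of Lucene.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

// Hypothetical helper class: lists the terms an analyzer produces for a text.
public class TokenDump {

    // Runs the text through the analyzer and collects every emitted term.
    static List<String> tokens(Analyzer analyzer, String field, String text) throws IOException {
        List<String> out = new ArrayList<>();
        try (TokenStream ts = analyzer.tokenStream(field, text)) {
            CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
            ts.reset();                      // required before the first incrementToken()
            while (ts.incrementToken()) {
                out.add(term.toString());
            }
            ts.end();
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        try (Analyzer analyzer = new StandardAnalyzer()) {
            // Prints the terms that would be indexed for the title field.
            System.out.println(tokens(analyzer, "title", "2017 on call schedule.xls"));
        }
    }
}
```

The printed list should contain schedule.xls as a single term and no separate schedule or xls, which is what the Luke view shows as well.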
I am using Lucene 6.6.6 and had to modify your code to:
Document doc = new Document();
doc.add(new TextField("contents", "Homer January, Lenny February",Store.YES));
doc.add(new TextField("title", "2017 on call schedule.xls", Store.YES));
iwriter.addDocument(doc);
doc = new Document();
doc.add(new TextField("contents", "Carl January, Frank February",Store.YES));
doc.add(new TextField("title", "2018 on call schedule.xls", Store.YES));
iwriter.addDocument(doc);
iwriter.commit();
Now for searching:
Your query parser is basically producing the query title:schedule, which means an exact term search (without wildcards) on the field title. Since there is no such term in the index, you get zero hits.
Modifying your query to
Query query1 = new QueryParser("title", analyzer).parse("schedule*");
will get you two hits.
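To check both behaviours side by side, here is a small end-to-end sketch. It assumes the Lucene core and queryparser modules are on the classpath; the class name TitleSearchDemo and the helper countHits are invented for this example, and the index is written to a temporary directory that is not cleaned up afterwards.

```java
import java.nio.file.Files;
import java.nio.file.Path;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// Hypothetical demo class: indexes the two titles and counts hits for a query string.
public class TitleSearchDemo {

    static int countHits(String queryText) throws Exception {
        // Throwaway on-disk index; the temp directory is left behind for simplicity.
        Path indexPath = Files.createTempDirectory("lucene-demo");
        try (Analyzer analyzer = new StandardAnalyzer();
             Directory dir = FSDirectory.open(indexPath)) {

            // Index both documents; closing the writer commits the changes.
            try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(analyzer))) {
                String[] titles = {"2017 on call schedule.xls", "2018 on call schedule.xls"};
                for (String title : titles) {
                    Document doc = new Document();
                    doc.add(new TextField("title", title, Field.Store.YES));
                    writer.addDocument(doc);
                }
            }

            // Parse the query against the title field and count matching documents.
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                IndexSearcher searcher = new IndexSearcher(reader);
                return searcher.count(new QueryParser("title", analyzer).parse(queryText));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("schedule  -> " + countHits("schedule"));
        System.out.println("schedule* -> " + countHits("schedule*"));
    }
}
```

The exact-term query looks for a term equal to schedule, while the trailing wildcard turns it into a prefix match that also accepts schedule.xls. That difference is why only the second query finds the two documents.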
So as a best practice, always take a look at your indexed data and visualize it before searching.