Lucene title/contents search

Question

I'm storing my Lucene docs like so:

Document doc = new Document();
doc.add(new TextField("contents", "Homer January, Lenny February"));
doc.add(new TextField("title", "2017 on call schedule.xls", Field.Store.YES));

Document doc = new Document();
doc.add(new TextField("contents", "Carl January, Frank February"));
doc.add(new TextField("title", "2018 on call schedule.xls", Field.Store.YES));

I can get a hit if I search for the exact title, or for like

but no hits if I try things like

call
on call
xls

I've tried simple things like

 Query query1 = new QueryParser("title", analyzer).parse("on call");

and more complicated ideas like

Builder bb = new BooleanQuery.Builder();
for(String chunk : "on call".split(" ")){
    bb.add(new TermQuery(new Term("title", chunk)), BooleanClause.Occur.SHOULD);
}
BooleanQuery booleanQuery = bb.build();

Maybe I'm storing my docs wrong?

I'm using the StandardAnalyzer on search and insert.

Seems like I'm missing something quite fundamental here. Anyone have any tips, please?

Answer 1

I think it's always a good idea to visualize your terms before running your search. Below is an image from the Luke tool.

That simply shows that the index contains no term schedule, only schedule.xls.

I am using Lucene 6.6.6 and had to modify your code to:

// Store is org.apache.lucene.document.Field.Store; iwriter is an open IndexWriter.
Document doc = new Document();
doc.add(new TextField("contents", "Homer January, Lenny February", Store.YES));
doc.add(new TextField("title", "2017 on call schedule.xls", Store.YES));
iwriter.addDocument(doc);

// Reuse the variable for the second document instead of redeclaring it.
doc = new Document();
doc.add(new TextField("contents", "Carl January, Frank February", Store.YES));
doc.add(new TextField("title", "2018 on call schedule.xls", Store.YES));
iwriter.addDocument(doc);

iwriter.commit();

Now for searching

Your query parser is basically producing the query title:schedule, which means an exact term search (without wildcards) on the field title. Since there is no such term, you get zero hits.

Modifying your query to Query query1 = new QueryParser("title", analyzer).parse("schedule*"); will get you two hits.

So as a best practice, always look at and visualize your indexed data before searching.
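To make the difference between an exact term match and a prefix match concrete, here is a minimal sketch in plain Java that does not use Lucene at all. It assumes a simplified tokenization (lowercase, split on whitespace) as a stand-in for the analyzer; for these titles that also leaves schedule.xls as a single term, like the Luke view described above. The class TermLookupSketch and its methods are hypothetical names made up for this illustration, not part of any Lucene API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

public class TermLookupSketch {

    // Simplified stand-in for an analyzer (assumption): lowercase, split on whitespace.
    public static TreeSet<String> terms(List<String> titles) {
        TreeSet<String> result = new TreeSet<>();
        for (String title : titles) {
            result.addAll(Arrays.asList(title.toLowerCase(Locale.ROOT).split("\\s+")));
        }
        return result;
    }

    // Exact lookup, analogous in spirit to the query title:schedule.
    public static boolean hasExactTerm(TreeSet<String> terms, String term) {
        return terms.contains(term);
    }

    // Prefix lookup, analogous in spirit to title:schedule*.
    // In a sorted set, the smallest element >= prefix starts with the prefix
    // if and only if some element does.
    public static boolean hasTermWithPrefix(TreeSet<String> terms, String prefix) {
        String candidate = terms.ceiling(prefix);
        return candidate != null && candidate.startsWith(prefix);
    }

    public static void main(String[] args) {
        TreeSet<String> terms = terms(Arrays.asList(
                "2017 on call schedule.xls",
                "2018 on call schedule.xls"));
        System.out.println("terms: " + terms);
        System.out.println("exact 'schedule': " + hasExactTerm(terms, "schedule"));
        System.out.println("prefix 'schedule': " + hasTermWithPrefix(terms, "schedule"));
        System.out.println("prefix 'xls': " + hasTermWithPrefix(terms, "xls"));
    }
}
```

In this model the exact lookup for schedule fails while the prefix lookup succeeds, which mirrors why schedule* finds the documents. A search for xls fails either way, because xls only appears at the end of the term schedule.xls and a prefix match cannot reach it.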

Answer 1
0 accepted 2017-12-06 05:24:56