
Lucene setBoost doesn't work

Our team just upgraded Lucene from 2.3 to 3.0, and we are confused about the setBoost and getBoost methods of Document. We want to set a boost on each document when adding it to the index, so that search results come back in an order that reflects the boosts we set. But the order does not change at all, and the boost of each document in the search response is still 1.0. Could someone give me a hint? Here is our code:

    String[] a = new String[] { "schindler", "spielberg", "shawshank", "solace", "sorcerer", "stone", "soap",
                "salesman", "save" };
    List<String> strings = Arrays.asList(a);
    AutoCompleteIndex index = new Index();
    IndexWriter writer = new IndexWriter(index.getDirectory(), AnalyzerFactory.createAnalyzer("en_US"), true,
                MaxFieldLength.LIMITED);
    float i = 1f;
    for (String string : strings) {
        Document doc = new Document();
        Field f = new Field(AutoCompleteIndexFactory.QUERYTEXTFIELD, string, Field.Store.YES,
                Field.Index.NOT_ANALYZED);
        doc.setBoost(i);
        doc.add(f);
        writer.addDocument(doc);
        i += 2f;
    }

    writer.close();
    IndexReader reader2 = IndexReader.open(index.getDirectory());
    for (int j = 0; j < reader2.maxDoc(); j++) {
        if (reader2.isDeleted(j)) {
            continue;
        }

        Document doc = reader2.document(j);
        Field f = doc.getField(AutoCompleteIndexFactory.QUERYTEXTFIELD);
        System.out.println(f.stringValue() + ":" + f.getBoost() + ", docBoost:" + doc.getBoost());
        doc.setBoost(j);

    }

Thank you for your answer. I have updated the code as you suggested, but it still doesn't work. The boost does not change the order of the results, and every search result has the same score (1.0). Please check my code below:

    public void testScore() throws Exception {
        String[] a = new String[] { "schindler", "spielberg", "shawshank", "solace", "sorcerer", "stone", "soap",
                "salesman", "save" };
        List<String> strings = Arrays.asList(a);
        AutoCompleteIndex index = new Index();
        IndexWriter writer = new IndexWriter(index.getDirectory(), AnalyzerFactory.createAnalyzer("en_US"), true,
                MaxFieldLength.LIMITED);

        float i = 1f;
        for (String string : strings) {
            Document doc = new Document();
            doc.add(new Field(AutoCompleteIndexFactory.QUERYTEXTFIELD, string, Field.Store.YES,
                    Field.Index.NOT_ANALYZED));
            doc.setBoost(i);
            //            System.out.println(doc.getBoost());
            i += 2f;
            writer.addDocument(doc);
        }

        writer.close();

        BooleanQuery
                .setMaxClauseCount(BooleanQuery.getMaxClauseCount() < getMaxQueryTextEntry() ? getMaxQueryTextEntry()
                        : BooleanQuery.getMaxClauseCount());
        Term searchTerm = new Term(AutoCompleteIndexFactory.QUERYTEXTFIELD, "s");
        PrefixQuery query = new PrefixQuery(searchTerm);
        IndexSearcher searcher = new IndexSearcher(index.getDirectory());

        TopDocs docs = searcher.search(query, 10);
        ScoreDoc[] hits = docs.scoreDocs;
        for (ScoreDoc hit2 : hits) {
            String hit = searcher.doc(hit2.doc).get(AutoCompleteIndexFactory.QUERYTEXTFIELD);
            System.out.println(hit + " score:" + hit2.score);
            System.out.println(searcher.explain(query, hit2.doc));
        }
    }

And the output is:

    Jun 17, 2010 4:12:18 PM INFO:

    schindler score:1.0
    1.0 = (MATCH) ConstantScoreQuery(querytexts:s*), product of:
      1.0 = boost
      1.0 = queryNorm

    spielberg score:1.0
    1.0 = (MATCH) ConstantScoreQuery(querytexts:s*), product of:
      1.0 = boost
      1.0 = queryNorm

    shawshank score:1.0
    1.0 = (MATCH) ConstantScoreQuery(querytexts:s*), product of:
      1.0 = boost
      1.0 = queryNorm

    solace score:1.0
    1.0 = (MATCH) ConstantScoreQuery(querytexts:s*), product of:
      1.0 = boost
      1.0 = queryNorm

    sorcerer score:1.0
    1.0 = (MATCH) ConstantScoreQuery(querytexts:s*), product of:
      1.0 = boost
      1.0 = queryNorm

    stone score:1.0
    1.0 = (MATCH) ConstantScoreQuery(querytexts:s*), product of:
      1.0 = boost
      1.0 = queryNorm

    soap score:1.0
    1.0 = (MATCH) ConstantScoreQuery(querytexts:s*), product of:
      1.0 = boost
      1.0 = queryNorm

    salesman score:1.0
    1.0 = (MATCH) ConstantScoreQuery(querytexts:s*), product of:
      1.0 = boost
      1.0 = queryNorm

    save score:1.0
    1.0 = (MATCH) ConstantScoreQuery(querytexts:s*), product of:
      1.0 = boost
      1.0 = queryNorm

The document boost is meant to take effect when you search, not when you iterate sequentially over the documents in the index, as in your code sample. Try the following experiment:

  1. Index just two documents: the first with id 1, text "schindler" and boost 3.0; the second with id 2, text "schindler" and boost 1.0.
  2. Open an IndexSearcher.
  3. Search for "schindler" and look at the order of the returned documents by id. The first id should be 1, because of its higher boost.

The meaning of document boost is: when all other scoring factors are equal, the document with the higher boost gets a higher score. Please see Lucene's scoring documentation and the explain() function.
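
To make the "all other factors equal" point concrete, here is a small toy model in plain Java. It is not Lucene code: `BoostSketch`, `Doc`, `scoringScore`, `constantScore` and `rank` are hypothetical names invented for this illustration, and the formulas are deliberately simplified. It models the two-document experiment above, plus the constant-score case that the explain() output in the question appears to show (a product of only `boost` and `queryNorm`, with no per-document factor).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: simplified formulas, NOT Lucene's scoring implementation.
final class BoostSketch {

    /** A matching document carrying a hypothetical index-time boost. */
    static final class Doc {
        final int id;
        final String text;
        final float boost;

        Doc(int id, String text, float boost) {
            this.id = id;
            this.text = text;
            this.boost = boost;
        }
    }

    /** Boost-aware score: all other factors are folded into baseScore. */
    static float scoringScore(Doc doc, float baseScore) {
        return baseScore * doc.boost;
    }

    /** Constant score: identical for every match, doc.boost is never read. */
    static float constantScore(Doc doc, float queryBoost, float queryNorm) {
        return queryBoost * queryNorm;
    }

    static float score(Doc doc, boolean useDocBoost) {
        return useDocBoost ? scoringScore(doc, 1.0f) : constantScore(doc, 1.0f, 1.0f);
    }

    /** Ids ordered by descending score; List.sort is stable, so ties keep input order. */
    static List<Integer> rank(List<Doc> docs, boolean useDocBoost) {
        List<Doc> sorted = new ArrayList<>(docs);
        sorted.sort((a, b) -> Float.compare(score(b, useDocBoost), score(a, useDocBoost)));
        List<Integer> ids = new ArrayList<>();
        for (Doc d : sorted) {
            ids.add(d.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        // Same text, different boosts. The low-boost document is listed first
        // so that any reordering caused by the boost is visible.
        List<Doc> docs = Arrays.asList(
                new Doc(2, "schindler", 1.0f),
                new Doc(1, "schindler", 3.0f));

        System.out.println("boost-aware order:    " + rank(docs, true));  // [1, 2]
        System.out.println("constant-score order: " + rank(docs, false)); // [2, 1]
    }
}
```

If the toy model matches what is happening, the identical 1.0 scores come from the prefix query being executed as a constant-score query, where a document boost has nothing to multiply. In Lucene 2.9/3.0, `MultiTermQuery.setRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE)` asks for a scoring rewrite instead; this is untested against the index in the question, so treat it as something to try rather than a confirmed fix.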
