
Why doesn't Lucene return all the documents in the index?

I'm using Lucene 5.3 to index a set of documents, and I search them with a BooleanQuery in which each term is boosted by some score.

My problem is that when I search the index, I get fewer hits than there are documents in the index.

    System.out.println("docs in the index = " + reader.numDocs());
    // e.g., docs in the index = 92
    TopDocs topDocs = indexSearcher.search(q, reader.numDocs()); // request up to numDocs() hits so no result is cut off
    ScoreDoc[] hits = topDocs.scoreDocs;
    System.out.println("results found: " + topDocs.totalHits);
    // e.g., results found: 44

What is the reason for this behaviour? Does Lucene ignore documents with a zero score?

How do I get all the documents in the index no matter what score they have?

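Lucene only returns documents that actually match the query. A document that matches none of the clauses is not a hit at all, which is different from being a hit with a zero score. If you want every document in the results, you need to make sure every document matches. You can do this with a MatchAllDocsQuery: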

    Query query = new BooleanQuery.Builder()
            .add(new BooleanClause(new MatchAllDocsQuery(), BooleanClause.Occur.MUST))
            .add(new BooleanClause(myOldQuery, BooleanClause.Occur.SHOULD))
            .build();
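
To see why the extra MUST clause changes the hit count, here is a small toy model in plain Java. It is not Lucene code and does not use Lucene's scoring formula; the class `BooleanMatchSketch`, the sample documents, and the boost values are all made up for illustration. It only mimics the matching rule: with optional clauses alone, a document is a hit only if at least one clause matches, while a required match-all clause makes every document a hit and the optional clauses then only add to the score.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: NOT Lucene, and not Lucene's scoring formula.
class BooleanMatchSketch {

    // Hypothetical "index": each document is just the set of terms it contains.
    static final List<Set<String>> DOCS = Arrays.asList(
            new HashSet<>(Arrays.asList("lucene", "index")),
            new HashSet<>(Arrays.asList("search", "query")),
            new HashSet<>(Arrays.asList("boost", "lucene")),
            new HashSet<>(Arrays.asList("unrelated")));

    // Returns docId -> score for every document that matches.
    // boosts:   the optional (SHOULD-like) terms and their boosts.
    // matchAll: models adding a required (MUST-like) match-all clause.
    static Map<Integer, Double> search(Map<String, Double> boosts, boolean matchAll) {
        Map<Integer, Double> hits = new LinkedHashMap<>();
        for (int docId = 0; docId < DOCS.size(); docId++) {
            boolean matched = matchAll;
            // Assumption for this sketch: match-all contributes a constant 1.0.
            double score = matchAll ? 1.0 : 0.0;
            for (Map.Entry<String, Double> clause : boosts.entrySet()) {
                if (DOCS.get(docId).contains(clause.getKey())) {
                    matched = true;
                    score += clause.getValue();
                }
            }
            // Documents that match no clause are never added to the hits.
            if (matched) {
                hits.put(docId, score);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, Double> boosts = new LinkedHashMap<>();
        boosts.put("lucene", 2.0);
        boosts.put("boost", 0.5);

        System.out.println("optional clauses only: "
                + search(boosts, false).size() + " of " + DOCS.size() + " docs");
        System.out.println("with match-all clause: "
                + search(boosts, true).size() + " of " + DOCS.size() + " docs");
    }
}
```

In this sketch the two documents that contain none of the boosted terms are missing from the first result and present in the second, where they carry only the constant match-all score. Documents that do match a boosted term still score higher, so the relative order among them is preserved.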
