
How to get exact search results on top using Apache Lucene?

How to get best score searches on top using Apache Lucene?

1. State Authority
2. Authority State

When a user searches for either "Authority State" or "State Authority", we get the same results for both, in the order shown above. For the search "Authority State", however, the results should be:

1. Authority State
2. State Authority

The following Lucene queries are run against the indexed fields:

name:Authority State* 
name:Authority State
name:Authority*
name:State*

// Wildcard query on the full user input, for every indexed field.
for (String field : INDEXED_FIELDS) {
    bool.should(qb.keyword().wildcard().onField(field).matching(userInputBuilder.toString()).createQuery());
}

// Keyword query on each individual pattern, for every indexed field.
for (String field : INDEXED_FIELDS) {
    for (String match : pattern) {
        bool.should(qb.keyword().onField(field).matching(match).createQuery());
    }
}

There is no sorting on results.

Could anyone suggest how to get the exact match ranked first?

The keyword query only checks that the tokens of the input are present in the field; it does not take their order into account.
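To see why both names come back as equally good matches, here is a minimal plain-Java sketch of order-insensitive token matching. It is only a toy model, not Lucene's or Hibernate Search's implementation: the class `KeywordMatchSketch` and its methods are hypothetical names, and splitting on whitespace stands in for a real analyzer.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Toy model (assumption): treats a field value as an unordered set of tokens.
final class KeywordMatchSketch {

    // Lowercase and split on whitespace; a stand-in for a real analyzer.
    static Set<String> tokens(String text) {
        return new HashSet<>(Arrays.asList(text.trim().toLowerCase(Locale.ROOT).split("\\s+")));
    }

    // Counts the distinct query tokens that occur in the document, ignoring position.
    static int matchedTokens(String query, String document) {
        Set<String> shared = tokens(query);
        shared.retainAll(tokens(document));
        return shared.size();
    }

    public static void main(String[] args) {
        String query = "Authority State";
        // Both documents contain the same two tokens, so the counts are identical.
        System.out.println(matchedTokens(query, "State Authority"));
        System.out.println(matchedTokens(query, "Authority State"));
    }
}
```

Because position is discarded before matching, nothing in this model can tell "Authority State" apart from "State Authority", which mirrors the behaviour described in the question.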

When the order of the tokens within the phrase matters, use the phrase query:

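Query query = queryBuilder
        .phrase()
            // A slop of 2 still lets two adjacent terms match in swapped order,
            // typically with a lower score than the exact phrase; use 0 for strict order.
            .withSlop( 2 ) // or other options of the phrase query
            .onField( field )
            .sentence( userInputBuilder.toString() )
        .createQuery();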
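The effect of combining such a phrase match with the existing keyword matches can be sketched in plain Java. This is a toy ranking under stated assumptions, not Lucene's scoring formula: `PhraseRankSketch`, `PHRASE_BONUS` and the point values are hypothetical, and the phrase check requires the tokens to be adjacent and in order (comparable to a slop of 0).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Toy model (assumption): token matches earn points, an in-order phrase earns a bonus.
final class PhraseRankSketch {

    static final int PHRASE_BONUS = 2; // arbitrary illustrative value

    // Lowercase and split on whitespace, keeping token order.
    static List<String> tokens(String text) {
        return Arrays.asList(text.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }

    // One point per query token found anywhere, plus a bonus when the
    // query tokens appear consecutively and in the same order.
    static int score(String query, String document) {
        List<String> q = tokens(query);
        List<String> d = tokens(document);
        int score = 0;
        for (String token : q) {
            if (d.contains(token)) {
                score++;
            }
        }
        if (Collections.indexOfSubList(d, q) >= 0) {
            score += PHRASE_BONUS;
        }
        return score;
    }

    // Returns a copy of the documents sorted by descending score (stable for ties).
    static List<String> rank(String query, List<String> documents) {
        List<String> ranked = new ArrayList<>(documents);
        ranked.sort(Comparator.comparingInt((String doc) -> score(query, doc)).reversed());
        return ranked;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("State Authority", "Authority State");
        System.out.println(rank("Authority State", docs));
        System.out.println(rank("State Authority", docs));
    }
}
```

In this model the document whose tokens follow the query order always outranks the reversed one. In the question's setup, the analogous step would presumably be to add the phrase query as one more `should` clause next to the existing keyword and wildcard clauses, so that in-order matches collect extra score.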

You might also be interested in trying out the latest "Simple Query Builder".

If you want to "debug" the scores, you can have the query engine return not just the results but also the score value and the evaluation formula used for each hit:

List<Object[]> results = (List<Object[]>) fullTextSession
    .createFullTextQuery( mltQuery, Coffee.class )
    .setProjection( ProjectionConstants.THIS, ProjectionConstants.SCORE, ProjectionConstants.EXPLANATION )
    .list();

This will get you, for each hit, an array of three elements:

  1. the matched entity instance
  2. the score value
  3. a Lucene Explanation object describing how the score was computed
