Lucene: compare results across queries

I need to compare the relevance of the search results across different Lucene queries.

I have an indexed set of text documents. When a search runs against this set, I want to return not the N best results but every result that fits the query "good enough".

This "good enough" threshold will be configurable, say between 0 (the document is absolutely irrelevant) and 1 (the document is the best possible match), but I want it to affect all queries in the same way.

From what I have found on the internet it is not a simple task. Could anybody give me a hint about how to approach this problem?

Thanks a lot!

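Comparing scores across different queries is clearly not valid, even if you normalize the scores to the [0,1] interval; see "How to normalize Lucene scores?"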
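To see why normalization alone does not help, here is a minimal sketch in plain Java. The raw scores are made-up numbers rather than real Lucene output, and `normalizeByMax` is a hypothetical helper that divides each score by the query's own top score, which is one common way to map scores into [0,1].

```java
import java.util.Arrays;

// Hypothetical illustration: the scores below are invented, not produced by Lucene.
final class ScoreNormalizationDemo {

    // Divide every score by the query's own top score, mapping results into [0, 1].
    static double[] normalizeByMax(double[] rawScores) {
        double max = Arrays.stream(rawScores).max().orElse(0.0);
        double[] normalized = new double[rawScores.length];
        for (int i = 0; i < rawScores.length; i++) {
            normalized[i] = max == 0.0 ? 0.0 : rawScores[i] / max;
        }
        return normalized;
    }

    public static void main(String[] args) {
        double[] strongQuery = {8.0, 6.0, 2.0};     // assumed: the top hit is a very good match
        double[] weakQuery = {0.5, 0.375, 0.125};   // assumed: the top hit is a poor match
        System.out.println(Arrays.toString(normalizeByMax(strongQuery)));
        System.out.println(Arrays.toString(normalizeByMax(weakQuery)));
    }
}
```

Both queries end up with the same normalized values, and the top hit of each is always 1.0, so a single "good enough" threshold cannot tell a strong match for one query from a weak match for another.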

If you want to compare two or more queries, I found a workaround. You can compare your highest-scored document with your query term using Lucene's Levenshtein distance class or the LuceneLevenshteinDistance (Damerau) class. Despite its name, getDistance returns a similarity between your query term and your result, where a higher value means a closer match.

Do this for each query you want to compare. You now have a way to compare your queries: the similarity between each query term and its highest-scored result. You can then pick the query with the highest similarity and use it for whatever you do next.

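import org.apache.lucene.search.spell.LuceneLevenshteinDistance;

// Damerau-Levenshtein: getDistance returns a similarity, 1.0 for identical strings
LuceneLevenshteinDistance d = new LuceneLevenshteinDistance();

float similarity = d.getDistance(queryterm, yourResult);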
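The whole workaround can be sketched end to end as follows. This is only an illustration under some assumptions: it uses a plain-Java Levenshtein similarity as a stand-in for Lucene's classes so that it runs without Lucene on the classpath, and the query terms and top hits are hypothetical. The names `editDistance`, `similarity`, and `bestQuery` are made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in sketch: plain-Java edit-distance similarity, not Lucene's own implementation.
final class QuerySimilarityDemo {

    // Classic Levenshtein edit distance using two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.length()];
    }

    // Similarity in [0, 1]: 1.0 for identical strings, 0.0 when every character differs.
    static float similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        if (longest == 0) return 1.0f;
        return 1.0f - (float) editDistance(a, b) / longest;
    }

    // Returns the query term whose top hit is most similar to it.
    static String bestQuery(Map<String, String> topHitByQuery) {
        String best = null;
        float bestSim = -1.0f;
        for (Map.Entry<String, String> e : topHitByQuery.entrySet()) {
            float sim = similarity(e.getKey(), e.getValue());
            if (sim > bestSim) { bestSim = sim; best = e.getKey(); }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, String> topHits = new LinkedHashMap<>();
        topHits.put("lucene", "lucent");     // hypothetical top hit for the first query
        topHits.put("search", "research");   // hypothetical top hit for the second query
        System.out.println(bestQuery(topHits));
    }
}
```

Note that this compares strings character by character, so it probably only makes sense when the top result is a short term or title rather than a full document body.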

I was just looking for the answer to this same question. Here's what I found in looking around:

While in general it is not possible to compare scores across queries, if you restrict yourself to certain types of queries, such as a BooleanQuery consisting only of TermQuery clauses, then it may be possible to compare results across queries if you disable the coord boost in the BooleanQuery constructor (the disableCoord argument).
