分層評分Lucene或術語治療

Question

我試圖將興趣資料轉換為一些Lucene查詢。

給定標題術語和一些擴展術語，采用JSON格式，例如

{"title":"Donald Trump", "Expansion":[["republic","republican"],["democratic","democrat"],["campaign"]]}

相應的Lucene查詢可以是如下的BooleanQuery（將標題項提升因子設置為3.0，而將擴展項提升因子設置為1.0）。

+(text:donald^3.0 text:trump^3.0 (text:democrat text:democratic) (text:republic text:republican) text:campaign)

使用IndexSearcher's explain()方法，

匹配的文檔，例如

I know people just want to find a way to be famous without taking any risks, republic republican Donald Trump Campaign.

得分9.0

3.0 = weight(text:donald^3.0 in 0) [TitleExpansionSimilarity], result of:
    3.0 = score(doc=0,freq=1.0), product of:
      3.0 = queryWeight, product of:
        3.0 = boost
        1.0 = idf(docFreq=201, maxDocs=201)
        1.0 = queryNorm
      1.0 = fieldWeight in 0, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        1.0 = idf(docFreq=201, maxDocs=201)
        1.0 = fieldNorm(doc=0)
  3.0 = weight(text:trump^3.0 in 0) [TitleExpansionSimilarity], result of:
    3.0 = score(doc=0,freq=1.0), product of:
      3.0 = queryWeight, product of:
        3.0 = boost
        1.0 = idf(docFreq=201, maxDocs=201)
        1.0 = queryNorm
      1.0 = fieldWeight in 0, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        1.0 = idf(docFreq=201, maxDocs=201)
        1.0 = fieldNorm(doc=0)
  2.0 = sum of:
    1.0 = weight(text:republic in 0) [TitleExpansionSimilarity], result of:
      1.0 = fieldWeight in 0, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        1.0 = idf(docFreq=201, maxDocs=201)
        1.0 = fieldNorm(doc=0)
    1.0 = weight(text:republican in 0) [TitleExpansionSimilarity], result of:
      1.0 = fieldWeight in 0, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        1.0 = idf(docFreq=201, maxDocs=201)
        1.0 = fieldNorm(doc=0)
  1.0 = weight(text:campaign in 0) [TitleExpansionSimilarity], result of:
    1.0 = fieldWeight in 0, product of:
      1.0 = tf(freq=1.0), with freq of:
        1.0 = termFreq=1.0
      1.0 = idf(docFreq=201, maxDocs=201)
      1.0 = fieldNorm(doc=0)

有什么方法可以重寫Lucene評分功能，也可以對BooleanQuery（text：republic text：republican）進行評分。 群集["republic","republican"]作為“ republic”的匹配權重或“ republican”的匹配權重的最大值？

1.0 = MAX(instead of sum) of:
    1.0 = weight(text:republic in 0) [TitleExpansionSimilarity], result of:
      1.0 = fieldWeight in 0, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        1.0 = idf(docFreq=201, maxDocs=201)
        1.0 = fieldNorm(doc=0)
    1.0 = weight(text:republican in 0) [TitleExpansionSimilarity], result of:
      1.0 = fieldWeight in 0, product of:
        1.0 = tf(freq=1.0), with freq of:
          1.0 = termFreq=1.0
        1.0 = idf(docFreq=201, maxDocs=201)
        1.0 = fieldNorm(doc=0)

Answer 1

不是通過Lucene的QueryParser語法，而是可以使用DisjunctionMaxQuery ，而不是BooleanQuery來組合查詢和得分以及其子查詢的最高得分，而不是子查詢得分的總和。

分層評分Lucene或術語治療

問題描述

1 個解決方案

解決方案1
0 2016-06-08 15:40:09

分層評分Lucene或術語治療

問題描述

1 個解決方案

解決方案1 0 2016-06-08 15:40:09

解決方案1
0 2016-06-08 15:40:09