Lucene .Net, custom scoring

I have the following Lucene explanation:

{1.25 = (MATCH) sum of:

  0.5 = (MATCH) weight(Caption:vrom^0.5 in 0) [MySimilarity], result of:
    0.5 = score(doc=0,freq=1 = termFreq=1
), product of:
      0.5 = queryWeight, product of:
        0.5 = boost
        1 = idf(docFreq=1, maxDocs=4)
        1 = queryNorm
      1 = fieldWeight in 0, product of:
        1 = tf(freq=1), with freq of:
          1 = termFreq=1
        1 = idf(docFreq=1, maxDocs=4)
        1 = fieldNorm(doc=0)

  0.75 = (MATCH) weight(Caption:vroma^0.75 in 0) [MySimilarity], result of:
    0.75 = score(doc=0,freq=1 = termFreq=1
), product of:
      0.75 = queryWeight, product of:
        0.75 = boost
        1 = idf(docFreq=1, maxDocs=4)
        1 = queryNorm
      1 = fieldWeight in 0, product of:
        1 = tf(freq=1), with freq of:
          1 = termFreq=1
        1 = idf(docFreq=1, maxDocs=4)
        1 = fieldNorm(doc=0)
}

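I want the score of a match to be the maximum of the clause scores (by query weight) rather than their sum.

That is, for each document I want to take the highest score produced by any single clause. (In this example I would like 0.75 as the match score instead of 1.25.) Is this possible, or even the right thing to do?

So far I have created a custom Similarity to change how the scores are calculated, but the result still lacks the part that takes the MAX instead of the SUM.

I am using Lucene .Net version 4.8 (beta).

Thanks in advance!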

There is no need to modify the Similarity. Instead of a BooleanQuery, use a DisjunctionMaxQuery.
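
As a minimal sketch of what this suggests (assuming Lucene.NET 4.8, and reusing the terms and boosts from the explanation in the question purely for illustration), each alternative becomes its own disjunct. With a tie-breaker of 0 the document score is the maximum of the matching disjunct scores; with a non-zero tie-breaker, that fraction of the other matching disjuncts' scores is added on top. The class and method names here are hypothetical.

```csharp
using Lucene.Net.Index;
using Lucene.Net.Search;

public static class MaxScoreQueryExample // hypothetical name, for illustration only
{
    public static Query Build()
    {
        // tieBreakerMultiplier = 0: only the best-scoring disjunct counts.
        var dismax = new DisjunctionMaxQuery(0.0f);

        // One disjunct per alternative term. The terms and boosts are
        // assumptions copied from the explain output in the question.
        dismax.Add(new TermQuery(new Term("Caption", "vrom")) { Boost = 0.5f });
        dismax.Add(new TermQuery(new Term("Caption", "vroma")) { Boost = 0.75f });

        return dismax;
    }
}
```

Calling IndexSearcher.Explain on a query built this way should report "max of:" at the top level rather than "sum of:".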

Thanks for the solution, but I still have a problem. When I try this, I get the same result as before. My code is as follows (I used the example from here):

BooleanQuery finalQuery = new BooleanQuery();
DisjunctionMaxQuery q1 = new DisjunctionMaxQuery(0.01f);
Query query = new FuzzyQuery(new Term("Caption", "roma"));
q1.Add(query);
finalQuery.Add(q1, Occur.MUST);

The result I get is (the same as in the question):

{1.25 = (MATCH) sum of:

  0.5 = (MATCH) weight(Caption:vrom^0.5 in 0) [MySimilarity], result of:
    0.5 = score(doc=0,freq=1 = termFreq=1
), product of:
      0.5 = queryWeight, product of:
        0.5 = boost
        1 = idf(docFreq=1, maxDocs=4)
        1 = queryNorm
      1 = fieldWeight in 0, product of:
        1 = tf(freq=1), with freq of:
          1 = termFreq=1
        1 = idf(docFreq=1, maxDocs=4)
        1 = fieldNorm(doc=0)

  0.75 = (MATCH) weight(Caption:vroma^0.75 in 0) [MySimilarity], result of:
    0.75 = score(doc=0,freq=1 = termFreq=1
), product of:
      0.75 = queryWeight, product of:
        0.75 = boost
        1 = idf(docFreq=1, maxDocs=4)
        1 = queryNorm
      1 = fieldWeight in 0, product of:
        1 = tf(freq=1), with freq of:
          1 = termFreq=1
        1 = idf(docFreq=1, maxDocs=4)
        1 = fieldNorm(doc=0)
}

I also tried this without the modified Similarity, but I see the same behavior. The result in that case is:

{0.505973 = (MATCH) sum of:

  0.2023892 = (MATCH) weight(Caption:vrom^0.5 in 0) [DefaultSimilarity], result of:
    0.2023892 = score(doc=0,freq=1 = termFreq=1
), product of:
      0.3187582 = queryWeight, product of:
        0.5 = boost
        1.693147 = idf(docFreq=1, maxDocs=4)
        0.3765274 = queryNorm
      0.6349302 = fieldWeight in 0, product of:
        1 = tf(freq=1), with freq of:
          1 = termFreq=1
        1.693147 = idf(docFreq=1, maxDocs=4)
        0.375 = fieldNorm(doc=0)

  0.3035838 = (MATCH) weight(Caption:vroma^0.75 in 0) [DefaultSimilarity], result of:
    0.3035838 = score(doc=0,freq=1 = termFreq=1
), product of:
      0.4781373 = queryWeight, product of:
        0.75 = boost
        1.693147 = idf(docFreq=1, maxDocs=4)
        0.3765274 = queryNorm
      0.6349302 = fieldWeight in 0, product of:
        1 = tf(freq=1), with freq of:
          1 = termFreq=1
        1.693147 = idf(docFreq=1, maxDocs=4)
        0.375 = fieldNorm(doc=0)
}
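
A likely reason the output does not change: FuzzyQuery is a MultiTermQuery, and at search time it is rewritten into a BooleanQuery with one TermQuery per matching term, which is where the "sum of:" in the explanation comes from. A DisjunctionMaxQuery holding a single disjunct has nothing to take a maximum over, so wrapping the whole FuzzyQuery in it does not change how the expanded terms are combined. One possible workaround, sketched here under the assumption that the default rewrite yields a BooleanQuery, is to rewrite the fuzzy query yourself and move its expanded clauses into a DisjunctionMaxQuery. FuzzyDisMax and ToDisjunctionMax are hypothetical names, and this is untested against the 4.8 beta.

```csharp
using Lucene.Net.Index;
using Lucene.Net.Search;

public static class FuzzyDisMax // hypothetical helper, not part of Lucene.NET
{
    public static Query ToDisjunctionMax(FuzzyQuery fuzzy, IndexReader reader, float tieBreaker = 0.0f)
    {
        // Expand the fuzzy query into its per-term clauses against this reader.
        Query rewritten = fuzzy.Rewrite(reader);

        // Assumption: the default rewrite produces a BooleanQuery.
        // If it does not, fall back to the rewritten query unchanged.
        if (!(rewritten is BooleanQuery expanded))
        {
            return rewritten;
        }

        // Re-add every expanded clause as a separate disjunct so that the
        // clause scores are combined with max instead of sum. Each clause
        // keeps the boost assigned to it during the rewrite.
        var dismax = new DisjunctionMaxQuery(tieBreaker);
        foreach (BooleanClause clause in expanded.Clauses)
        {
            dismax.Add(clause.Query);
        }
        return dismax;
    }
}
```

The returned query would be passed to the searcher in place of finalQuery, for example `searcher.Search(FuzzyDisMax.ToDisjunctionMax(query, reader), 10)`, where searcher and reader are the ones already in use. Absolute scores may differ from the BooleanQuery version, because query normalization is computed over a different query.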
