Elasticsearch: Completely disable IDF for search result scoring

Question

This is my sample data in Elasticsearch:

{
    "_index": "12_index",
    "_type": "skill_strings",
    "_id": "AVKv-kM4axmY3fECZw9T",
    "_source": {
       "str": "PHP PHP PHP"
    }
 },
 {
    "_index": "12_index",
    "_type": "skill_strings",
    "_id": "AVKv-kNfaxmY3fECZw9U",
    "_source": {
       "str": "Javascript PHP Javascript Javascript"
    }
 }


"bool":{
  "must":[
    // some conditions
    {"match_phrase":{"str":"php"}}
  ],
  "should":[
    {"match_phrase":{"sentences":"Javascript"}}
  ]
}

Norms are disabled.

In the result set, php (with 16 occurrences) gets a score of 13.65 (rounded), whereas Javascript, with the same number of occurrences in another document, gets a lower score of 9.58.

For my use case, I want the same score for the same term frequency, regardless of how rare a word is or how short or long the field is.

How can I do that?

Answer 1

Here are two potential ways:

1) Custom similarity configuration. See the example here for how this is possible: https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-similarity.html#scripted_similarity

2) Create a scripting engine:

https://www.elastic.co/guide/en/elasticsearch/reference/master/modules-scripting-engine.html

In most cases, (1) should be the easiest.
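
As a rough illustration of option (1), here is a minimal sketch of an index-creation request body that defines a scripted similarity scoring on term frequency alone, with no IDF and no length normalization. The similarity name `tf_only` is made up for this example, and the typeless `mappings` layout assumes a recent Elasticsearch version; the `_type` in the question suggests an older release, where scripted similarity may not be available and the mapping syntax differs.

```json
{
  "settings": {
    "similarity": {
      "tf_only": {
        "type": "scripted",
        "script": {
          "source": "return query.boost * doc.freq;"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "str": {
        "type": "text",
        "similarity": "tf_only"
      }
    }
  }
}
```

With this script, the score for a matching term depends only on `doc.freq` (how often the term occurs in the document's field) and the query boost, so two documents with the same term frequency should score the same for that clause. The similarity of an existing field generally cannot be swapped in place, so applying this will probably mean creating a new index and reindexing.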

Solution 1
2 2016-02-10 05:28:14
