Elasticsearch: Disable IDF completely for search result scoring
This is my sample data in Elasticsearch:
{
  "_index": "12_index",
  "_type": "skill_strings",
  "_id": "AVKv-kM4axmY3fECZw9T",
  "_source": {
    "str": "PHP PHP PHP"
  }
},
{
  "_index": "12_index",
  "_type": "skill_strings",
  "_id": "AVKv-kNfaxmY3fECZw9U",
  "_source": {
    "str": "Javascript PHP Javascript Javascript"
  }
}
This is my query:
"bool": {
  "must": [
    // some conditions
    {"match_phrase": {"str": "php"}}
  ],
  "should": [
    {"match_phrase": {"sentences": "Javascript"}}
  ]
}
Norms are disabled.
In the result set, php (with 16 occurrences) gets a score of 13.65 (rounded), whereas Javascript, with the same number of occurrences in another document, gets a lower score of 9.58.
For my use case, irrespective of how rare a word is or how short or long the field is, I want the same score for the same term frequency.
How can I do that?
Here are two potential ways:
1) Custom similarity configuration. See the scripted similarity example here: https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-similarity.html#scripted_similarity
2) Create a scripting engine:
https://www.elastic.co/guide/en/elasticsearch/reference/master/modules-scripting-engine.html
In most cases, (1) should be easiest.
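As a rough illustration of option (1), the index settings below sketch a scripted similarity whose score depends only on term frequency, with no IDF and no length normalization. This is a minimal sketch, not a tested configuration: it assumes an Elasticsearch version that supports the `scripted` similarity type and typeless mappings, the similarity name `tf_only` is made up for this example, and `str` is the field from the question. Changing the similarity of an existing field may require creating a new index and reindexing.

```json
{
  "settings": {
    "similarity": {
      "tf_only": {
        "type": "scripted",
        "script": {
          "source": "double tf = doc.freq; return query.boost * tf;"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "str": {
        "type": "text",
        "norms": false,
        "similarity": "tf_only"
      }
    }
  }
}
```

With a similarity like this, two documents in which the queried term occurs the same number of times should receive the same score for that clause, regardless of how rare the term is across the index. Replacing `doc.freq` with `Math.sqrt(doc.freq)` would dampen the effect of repeated occurrences while still ignoring IDF.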