Elasticsearch - How to boost score by the results of an aggregation?
My use case is as follows: execute a search against Products and boost each document's score by its salesRank relative to the other documents in the results. The top 10% of sellers should be boosted by a factor of 1.5, and those in the top 10-25% should be boosted by a factor of 1.25. The percentiles are calculated on the results of the query, not the entire data set. This feature is being used for on-the-fly instant results as the user types, so single-character queries would still return results.
So, for example, if I search for "Widget" and get back 100 results, the top 10 sellers returned will get boosted by 1.5 and those ranked 11th to 25th will get boosted by 1.25.
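To make the intended tiers concrete, here is a minimal Python sketch of the rule described above, applied client-side to a list of salesRank values. It is only an illustration of the desired outcome, not something Elasticsearch does on its own; the function name `rank_boosts` and its parameters are hypothetical, and it assumes that a higher salesRank means more sales (as noted later in this question).

```python
def rank_boosts(sales_ranks, top=0.10, second=0.25,
                top_boost=1.5, second_boost=1.25):
    """Return one boost per input salesRank, in input order.

    Assumption: a higher salesRank means more sales.
    """
    n = len(sales_ranks)
    # Indices of the results, best seller first.
    order = sorted(range(n), key=lambda i: sales_ranks[i], reverse=True)
    boosts = [1.0] * n
    for position, i in enumerate(order):
        if position < n * top:
            # Within the top 10% of this result set.
            boosts[i] = top_boost
        elif position < n * second:
            # Within the top 25%, but not the top 10%.
            boosts[i] = second_boost
    return boosts
```

With 100 results, this gives 10 documents a boost of 1.5, the next 15 a boost of 1.25, and leaves the remaining 75 unchanged.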
I immediately thought of using the percentiles aggregation feature to calculate the 75th and 90th percentiles of the result set.
POST /catalog/product/_search?_source_include=name,salesRank
{
  "query": {
    "match_phrase_prefix": {
      "name": "N"
    }
  },
  "aggs": {
    "sales_rank_percentiles": {
      "percentiles": {
        "field": "salesRank",
        "percents": [75, 90]
      }
    }
  }
}
This gets me the following:
{
  "hits": {
    "total": 142,
    "max_score": 1.6653868,
    "hits": [
      {
        "_score": 1.6653868,
        "_source": {
          "name": "nylon",
          "salesRank": 46
        }
      },
      {
        "_score": 1.6643861,
        "_source": {
          "name": "neon",
          "salesRank": 358
        }
      },
      ..... <SNIP> .....
    ]
  },
  "aggregations": {
    "sales_rank_percentiles": {
      "values": {
        "75.0": 83.25,
        "90.0": 304
      }
    }
  }
}
So great, that gives me the results and the percentiles. But I would like to boost "neon" above "nylon" because it's a top 10% seller in the results (note: in our system, a higher salesRank value means more sales). The text relevancy is very low since only one character was supplied, so sales rank should have a big effect.
It seems that a function score query could be used here, but all of the examples in the documentation use doc[] to read values from the document. There aren't any that use other information from the top level of the response, e.g. "aggs": {}. I would basically like to boost a document by 1.5 if its sales rank falls within the 90th-100th percentiles, and by 1.25 if it falls within the 75th-89th percentiles.
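One possible workaround, sketched here in Python, is a two-request approach: run the percentiles aggregation first, then feed the two thresholds into a second function score query as range filters. This is an assumption about how the problem could be handled client-side, not a confirmed built-in feature. The helper names are hypothetical, and the `boost_factor` key reflects my understanding of the 1.x function score syntax (later versions use `weight` instead).

```python
def thresholds_from_response(response, agg_name="sales_rank_percentiles"):
    """Extract the 75th and 90th percentile values from the first response."""
    values = response["aggregations"][agg_name]["values"]
    return values["75.0"], values["90.0"]


def build_boosted_query(text, p75, p90, field="salesRank"):
    """Build a second-pass search body that boosts by percentile thresholds.

    Assumption: a higher salesRank means more sales, so the top 10% are the
    documents at or above the 90th percentile.
    """
    return {
        "query": {
            "function_score": {
                "query": {"match_phrase_prefix": {"name": text}},
                "functions": [
                    # At or above the 90th percentile of the first-pass results.
                    {"filter": {"range": {field: {"gte": p90}}},
                     "boost_factor": 1.5},
                    # Between the 75th and 90th percentiles.
                    {"filter": {"range": {field: {"gte": p75, "lt": p90}}},
                     "boost_factor": 1.25},
                ],
                # Use the first matching function and multiply it into the
                # text relevance score.
                "score_mode": "first",
                "boost_mode": "multiply",
            }
        }
    }
```

With the response shown earlier, the thresholds would be 83.25 and 304, so "neon" (salesRank 358) would match the first filter and "nylon" (salesRank 46) would match neither. The cost is a second round trip per keystroke, which may matter for instant results.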
Is this something Elasticsearch supports, or am I going to have to roll my own with a custom script or plugin? Or try a different approach entirely? My preference would be to pre-calculate percentiles, index them, and do a constant score boost, but the stakeholder prefers the run-time calculation.
I'm using Elasticsearch 1.2.0.
What if you keep sellers as a parent document and periodically update their stars (and some boosting factor), say, via some worker? You could then match products using a has_parent query, and use a combination of score mode and a custom score query to match top products from top sellers.