Elastic Search sort preprocessing

I have an index in ES that has, among other fields, revenue_amount and revenue_currency fields. Revenue is stored in different currencies. At run time, all amounts are converted to USD before being rendered.

Now, I would like to support sorting on the revenue_amount field. The problem is that ES sorts results by the revenue amount before it is converted to USD, so the result returned at the top might not have the highest revenue after conversion.

I was wondering whether it is possible for ES to call a user-defined function that changes a field value before sorting, and then apply the sort afterwards. Something like this:

revenue_converted = convertToUSD(revenue)

The sorting would then be applied to revenue_converted rather than revenue.

I know I can convert the currencies at index time, but that would require refreshing the index every time the rates are updated, so I would like to avoid it if possible.

You have two ways of achieving this. The first is script-based sorting, as keety mentioned:

{
    "query" : {
        ....                                    <--- your query goes here
    },
    "sort" : {
        "_script" : {
            "script" : "doc.revenue_amount.value * usd_conversion_rate",
            "type" : "number",
            "params" : {
                "usd_conversion_rate" : 0.4273  <--- the conversion rate to USD
            },
            "order" : "desc"
        }
    }
}

The usd_conversion_rate factor is the conversion rate to USD. For instance, if 1 USD is worth 2.34 units of another currency, the usd_conversion_rate factor would be 1 / 2.34 (approximately 0.4273). Multiplying revenue_amount by this factor gives you the amount in USD, the reference currency.
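The arithmetic above can be sketched in Python. This is only an illustration: the helper names usd_conversion_rate and build_script_sort are hypothetical and not part of any Elasticsearch client, and the sort clause simply mirrors the request body shown above as a plain dict.

```python
def usd_conversion_rate(units_per_usd: float) -> float:
    """Return the factor that converts a foreign-currency amount to USD.

    units_per_usd is how many units of the foreign currency 1 USD is worth.
    """
    if units_per_usd <= 0:
        raise ValueError("units_per_usd must be positive")
    return 1.0 / units_per_usd


def build_script_sort(rate: float, order: str = "desc") -> dict:
    """Build the "sort" clause from the request above (hypothetical helper)."""
    return {
        "_script": {
            "script": "doc.revenue_amount.value * usd_conversion_rate",
            "type": "number",
            "params": {"usd_conversion_rate": rate},
            "order": order,
        }
    }


# 1 USD is worth 2.34 units of the other currency, as in the example above.
rate = usd_conversion_rate(2.34)
sort_clause = build_script_sort(rate)
```

Passing the rate through params, rather than hard-coding it in the script string, means only the parameter value changes when the exchange rate is updated.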

Script-based sorting is not very performant, though, and the recommendation is to use a function_score query so that results can be sorted by score instead. That leads us to the second way of achieving what you need. One option is a script_score function, but that involves scripting again:

{
  "query": {
    "function_score": {
      "query": {},
      "functions": [
        {
          "script_score": {
            "script": "doc.revenue_amount.value * usd_conversion_rate",
            "params": {
              "usd_conversion_rate": 0.4273
            }
          }
        }
      ],
      "boost_mode": "replace"
    }
  }
}

Since the script above is very simple (i.e. it multiplies a field by some factor), the simplest way is to use field_value_factor:

{
  "query": {
    "function_score": {
      "query": {
        ...                              <--- your query goes here
      },
      "functions": [
        {
          "field_value_factor": {
            "field": "revenue_amount",
            "factor": 0.4273             <--- insert the conversion rate here
          }
        }
      ],
      "boost_mode": "replace"
    }
  }
}

UPDATE

According to your latest comment, it seems that the right option for you is script_score after all. The idea is to pass all the currency rates available in your lookup table as parameters of the script_score script, and then pick the proper one according to the value of the revenue_currency field.

{
  "query": {
    "function_score": {
      "query": {},
      "functions": [
        {
          "script_score": {
            "script": "doc.revenue_amount.value * (doc.revenue_currency.value == 'EUR' ? EUR : (doc.revenue_currency.value == 'AUD' ? AUD : 1))",
            "params": {
              "EUR": 0.4945,
              "AUD": 0.5623
            }
          }
        }
      ],
      "boost_mode": "replace"
    }
  }
}
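With more than a couple of currencies, writing the nested ternary by hand gets tedious. As a rough sketch, the script string and its params could be generated from the lookup table in Python. The helper names build_conversion_script and build_query are hypothetical, the match_all default for the inner query is an assumption, and boost_mode is placed at the function_score level, which is where the function_score query defines that option. Currencies missing from the table fall through to a factor of 1, as in the request above.

```python
def build_conversion_script(rates: dict) -> str:
    """Build the nested ternary expression for the given rates table.

    rates maps a currency code to its USD conversion factor; unlisted
    currencies fall through to a factor of 1.
    """
    expr = "1"
    # Wrap from the last currency outwards so the first one is tested first.
    for code in reversed(list(rates)):
        if not code.isalpha():
            raise ValueError(f"unexpected currency code: {code!r}")
        expr = f"(doc.revenue_currency.value == '{code}' ? {code} : {expr})"
    return f"doc.revenue_amount.value * {expr}"


def build_query(rates: dict, inner_query=None) -> dict:
    """Build the function_score request body (hypothetical helper)."""
    if inner_query is None:
        inner_query = {"match_all": {}}  # assumption: match every document
    return {
        "query": {
            "function_score": {
                "query": inner_query,
                "functions": [
                    {
                        "script_score": {
                            "script": build_conversion_script(rates),
                            "params": dict(rates),
                        }
                    }
                ],
                "boost_mode": "replace",
            }
        }
    }


body = build_query({"EUR": 0.4945, "AUD": 0.5623})
```

When the rates change, only the table passed to build_query needs updating; the index itself is left untouched.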
