What's the fastest algorithm to evaluate 10,000 rules?

I have pricing rules that determine how much discount to give a customer. The problem is that I need a lot of rules, around 10,000, and performance would be very slow if I looped over all 10,000 rules for each customer request.

There are many conditions I need to check before applying a given discount:

- Product type (clothes, electronics, etc)
- Product SKU
- Customer location
- Search date (e.g. >= 2019-01-01 And <= 2019-01-31)
- ...
- (about 30 conditions in total)

Here are examples of the rules I'd like to set:

Rule 1: product type = 'clothes', then discount 10%
Rule 2: product type = 'electronics', then discount 5%
Rule 3: product type = 'clothes' AND customer location = 'AUSTRALIA', then discount 7%
.
.
.
Rule 10,000: ....

I also want each rule to have a priority, so if Rule 3 has a higher priority than Rule 1, the discount from Rule 3 is the one applied.

The naive approach would be to loop over all 10,000 rules and check, one by one, whether each matches the conditions, but the performance would be very bad. And what if I want to add another 10,000 rules?
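For reference, the naive loop described above might look like the following minimal Python sketch. The `Rule` class, the field names, and the convention that a larger `priority` number wins are all assumptions made for illustration; they are not part of the original question.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    conditions: dict   # field name -> required value; a missing field means "any"
    discount: float    # discount in percent
    priority: int      # assumed convention: larger number wins


# The three example rules from the question, with made-up priorities.
RULES = [
    Rule({"product_type": "clothes"}, 10.0, priority=1),
    Rule({"product_type": "electronics"}, 5.0, priority=1),
    Rule({"product_type": "clothes", "customer_location": "AUSTRALIA"}, 7.0, priority=2),
]


def naive_discount(request, rules):
    """Scan every rule; return the discount of the highest-priority match, or None."""
    best = None
    for rule in rules:
        matches = all(request.get(k) == v for k, v in rule.conditions.items())
        if matches and (best is None or rule.priority > best.priority):
            best = rule
    return best.discount if best else None
```

Every request touches every rule, so the cost grows linearly with the number of rules, and it is paid once per product in the search result.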

I'd be very interested to know whether there is a better approach than looping over all the rules.

Update: the rules need to be evaluated every time a user performs a search. There's a search bar where the user types the keywords he wants to find, and the page returns all the products that match them. A result can contain up to 50 products, so for each user search we need to work out up to 50 times which rules apply.

This may be a bit of overkill, but when I think speed, I think hash tables, where unique rules are stored as (rule, discount) pairs.

For this to work, you'll first need to categorize your rule criteria (product type, country, etc.). Second, you'll need to assign a number to (enumerate) each member of each category:

Countries[Australia = 1, New Zealand = 2, ...]
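As a small Python sketch of that enumeration step (the helper name `enumerate_members` is made up for illustration), numbering can start at 1 so that 0 stays free to mean "not specified":

```python
def enumerate_members(members):
    """Map each member to 1, 2, 3, ... so that 0 can mean 'not specified'."""
    return {name: code for code, name in enumerate(members, start=1)}


COUNTRIES = enumerate_members(["Australia", "New Zealand"])
```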

After that, split every rule that accepts multiple values for a criterion into separate rules:

Rule 3: product type = 'clothes' AND (customer location = 'AUSTRALIA' OR customer location = 'NEW ZEALAND'), then discount 7%

becomes

Rule 4: product type = 'clothes' AND customer location = 'AUSTRALIA', then discount 7%
Rule 5: product type = 'clothes' AND customer location = 'NEW ZEALAND', then discount 7%
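One way to do this split mechanically, sketched in Python under the assumption that each rule lists its acceptable values per criterion (the `split_rule` helper and the dict layout are hypothetical), is to take the Cartesian product of the value lists:

```python
from itertools import product


def split_rule(conditions, discount):
    """conditions maps criterion -> list of acceptable values.

    Returns one flat (conditions, discount) rule per combination of values.
    """
    fields = list(conditions)
    value_lists = [conditions[f] for f in fields]
    return [(dict(zip(fields, combo)), discount) for combo in product(*value_lists)]


rule_3 = {
    "product_type": ["clothes"],
    "customer_location": ["AUSTRALIA", "NEW ZEALAND"],
}
flat_rules = split_rule(rule_3, 7)
```

Note that a rule with several multi-valued criteria multiplies out, so the number of flat rules can grow quickly.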

Now you have an array of criteria to check. If a criterion is not specified, leave a zero. For instance, for the array of criteria:

[product type, customer location, month]

you can have the values

['decorations', '', 'December']

which translate to

[23, 0, 12]
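A minimal Python sketch of this translation follows. Only the codes 23 for 'decorations' and 12 for 'December' come from the example above; the rest of the `CODES` tables and the `encode` helper are assumptions for illustration.

```python
CRITERIA = ["product_type", "customer_location", "month"]

# Hypothetical code tables; only decorations -> 23 and December -> 12 are from the text.
CODES = {
    "product_type": {"decorations": 23},
    "customer_location": {"Australia": 1, "New Zealand": 2},
    "month": {"December": 12},
}


def encode(values):
    """Turn a dict of criterion -> value into a list of codes, 0 where unspecified.

    Caveat of this sketch: a value missing from the tables also maps to 0.
    """
    return [CODES[c].get(values.get(c, ""), 0) for c in CRITERIA]


encoded = encode({"product_type": "decorations", "customer_location": "", "month": "December"})
```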

and if you have a total of, say, 8 types of criteria to check, you end up with a final array that looks like this:

[0, 0, 0, 23, 0, 0, 12, 0]

Now it's time to find the specific rule that applies, by applying a hash function H() to the array in some form. You could simply string the digits together:

=H(0002300120)

or you could multiply each successive number by a greater power of 10 and then add the results together (this only works for a limited number of criteria, roughly fewer than 20, because the sum has to stay below the 2^64 limit of a 64-bit integer):

=H(230000 + 120000000)
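Both ways of building the key can be sketched in Python as follows. The function names are made up, and multiplying the value at position p (counting from 1) by 10^p is an assumption inferred from the numbers in the example above.

```python
codes = [0, 0, 0, 23, 0, 0, 12, 0]


def key_concat(codes):
    """String the codes together, e.g. [0, 0, 0, 23, ...] -> '00023...'."""
    return "".join(str(c) for c in codes)


def key_powers(codes):
    """Multiply the code at 1-based position p by 10**p and sum the results."""
    return sum(c * 10**p for p, c in enumerate(codes, start=1))


concat_key = key_concat(codes)   # "0002300120"
powers_key = key_powers(codes)   # 230000 + 120000000
```

Neither key is guaranteed to be unique once codes have more than one digit: for example, [1, 23] and [12, 3] both concatenate to "123". In Python specifically, a simpler option is to use `tuple(codes)` directly as a dictionary key, since tuples of integers are hashable.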

The beauty of hash tables is that lookups are almost instant, O(1) on average, provided the table is created with enough space to begin with and has both a good hash function H() and a collision-resolution mechanism (because H() may not produce a unique value every time).
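Putting the pieces together, here is a hedged Python sketch that uses a built-in `dict` as the hash table. Two things in it go beyond what is described above and are assumptions of this sketch: each entry stores a priority (larger wins) next to the discount, and the lookup probes once per distinct "mask" of specified criteria so that rules which leave some criteria unspecified can still match. All names are hypothetical.

```python
CRITERIA = ("product_type", "customer_location")


def rule_key(conditions):
    """Tuple of values in fixed criterion order; None marks 'not specified'."""
    return tuple(conditions.get(c) for c in CRITERIA)


def build_table(rules):
    """rules: iterable of (conditions, discount, priority). Returns (table, masks)."""
    table = {}
    for conditions, discount, priority in rules:
        key = rule_key(conditions)
        # If two rules share a key, keep only the higher-priority one.
        if key not in table or priority > table[key][0]:
            table[key] = (priority, discount)
    # A mask records which positions some rule actually specifies.
    masks = {tuple(v is not None for v in key) for key in table}
    return table, masks


def lookup(request, table, masks):
    """Probe the table once per mask; return the best matching discount or None."""
    full = rule_key(request)
    best = None
    for mask in masks:
        probe = tuple(v if keep else None for v, keep in zip(full, mask))
        hit = table.get(probe)
        if hit is not None and (best is None or hit[0] > best[0]):
            best = hit
    return best[1] if best else None


RULES = [
    ({"product_type": "clothes"}, 10, 1),
    ({"product_type": "electronics"}, 5, 1),
    ({"product_type": "clothes", "customer_location": "AUSTRALIA"}, 7, 2),
]
TABLE, MASKS = build_table(RULES)
```

With this layout the cost per product depends on the number of distinct masks rather than on the number of rules. Exact-match hashing like this does not cover range conditions such as the search date, which would need separate handling.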
