
OR predicate optimization

Suppose I have an entity with 3 attributes, A1, A2 and A3, such that:

  1. A1 can only have the values 1, 2, 3
  2. A2 can only have the values 10, 20, 30, 40, 50
  3. A3 can only have the values 100, 200

And a number of rules, for example:

R1: (A1 IN (1, 2)) AND (A2 IN (20, 40, 50)) AND (A3 IN (100))
R2: (A1 IN (1, 3)) AND (A2 IN (10, 30)) AND (A3 IN (200))
R3: (A1 IN (1, 2)) AND (A2 IN (10)) AND (A3 IN (100))

Then there is a predicate, R = R1 OR R2 OR R3, which I would like to minimize. The thing is that A1=1 covers all possible variations of A2 and A3, so we can bring it into a separate clause: R = (A1=1) OR (the rest)

I've tried boolean minimization methods by declaring variables as a=(A1=1), b=(A1=2), ..., k=(A3=200). However, that does not seem to work, because:

  1. the boolean optimizer is not aware of all the values of attribute A
  2. the boolean variables are not independent

When I try to address these issues, the expression becomes too complex, and neither QMC nor Espresso is able to minimize it in the desired way.

I've also tried storing each-to-each mappings and, whenever one of them has all the values of another one, using it as an aggregation anchor, then removing it and repeating, but that takes an eternity and quite a lot of RAM.

Maybe we can represent the attribute values as sets and approach the problem from a set-theory point of view.
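A minimal sketch of that set-based view, using the domains and rules listed above. The names `DOMAINS`, `RULES`, `rule_tuples` and `accepted` are made up for illustration, and expanding every rule into tuples is only feasible while the domains stay small:

```python
from itertools import product

# Attribute domains from the question.
DOMAINS = {"A1": {1, 2, 3}, "A2": {10, 20, 30, 40, 50}, "A3": {100, 200}}

# Each rule maps an attribute to the set of values it accepts.
RULES = [
    {"A1": {1, 2}, "A2": {20, 40, 50}, "A3": {100}},  # R1
    {"A1": {1, 3}, "A2": {10, 30}, "A3": {200}},      # R2
    {"A1": {1, 2}, "A2": {10}, "A3": {100}},          # R3
]


def rule_tuples(rule):
    """Expand one rule into the set of (A1, A2, A3) tuples it accepts."""
    return set(product(rule["A1"], rule["A2"], rule["A3"]))


def accepted(rules):
    """R = R1 OR R2 OR R3 is the union of the per-rule tuple sets."""
    result = set()
    for rule in rules:
        result |= rule_tuples(rule)
    return result


ACCEPTED = accepted(RULES)
```

Any candidate rewrite of R can then be checked by expanding it the same way and comparing the two sets.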

Have you ever faced a problem like this? Are you aware of better ways to solve it? (Heuristics are OK as well.)

One way to optimize the expression for evaluation could be to split the rules repeatedly on the attribute with the fewest values. After this expansion you can collect the values again for the branches that end in the same last clause.

  1. Make 2 groups, one for the rules that accept A3 = 100 and one for the rules that accept A3 = 200. A rule can end up in both groups. Then modify each rule in a group so that it only accepts the value for that group and not the other one.

  2. Group those groups again on the values of A1 using the same logic.

You would end up with an expanded expression like this:

A3 = 100 AND (
    (A1 = 1 AND A2 IN (10, 20, 40, 50)) OR
    (A1 = 2 AND A2 IN (10, 20, 40, 50)))
OR A3 = 200 AND (
    (A1 = 1 AND A2 IN (10, 30)) OR
    (A1 = 3 AND A2 IN (10, 30)))

Basically we are constructing a tree with the values for A3 at depth 1, the values for A1 at depth 2, and the values for A2 at depth 3. If there is a path from root to leaf using the attribute values, then the rule is fulfilled; otherwise it isn't.
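A rough sketch of that tree as nested dicts, assuming the rules are stored as attribute-to-set mappings. `build_tree` and `matches` are hypothetical helper names, not part of any library:

```python
from collections import defaultdict

# The three rules from the question, as attribute -> accepted values.
RULES = [
    {"A1": {1, 2}, "A2": {20, 40, 50}, "A3": {100}},  # R1
    {"A1": {1, 3}, "A2": {10, 30}, "A3": {200}},      # R2
    {"A1": {1, 2}, "A2": {10}, "A3": {100}},          # R3
]


def build_tree(rules, order=("A3", "A1", "A2")):
    """Split the rules on each attribute in `order`, fewest values first.

    Inner nodes are dicts keyed by attribute value; the last attribute
    becomes a leaf holding the set of values accepted on that path.
    """
    first, rest = order[0], order[1:]
    if not rest:
        leaf = set()
        for rule in rules:
            leaf |= rule[first]
        return leaf
    groups = defaultdict(list)
    for rule in rules:
        for value in rule[first]:  # a rule can land in several groups
            groups[value].append(rule)
    return {value: build_tree(group, rest) for value, group in groups.items()}


def matches(tree, values):
    """Follow `values` (given in tree order: A3, A1, A2) from root to leaf."""
    node = tree
    for value in values[:-1]:
        if value not in node:
            return False
        node = node[value]
    return values[-1] in node


TREE = build_tree(RULES)
```

Evaluating an entity is then a walk down the tree, for example `matches(TREE, (100, 2, 10))`.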

After that you can merge all nodes that have the same subtree and the same parent. To do this, compare the leaves of all nodes with the same parent and merge the nodes whose leaves match. Then go one level up, compare the nodes with the same parent again, and so on.
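The merge step could look roughly like this. It is a sketch that assumes the expanded tree is a nested dict with sets at the leaves; `merge` is a made-up helper and the frozenset-based output format is just one possible choice:

```python
# Expanded tree for the example: A3 -> A1 -> accepted A2 values.
TREE = {
    100: {1: {10, 20, 40, 50}, 2: {10, 20, 40, 50}},
    200: {1: {10, 30}, 3: {10, 30}},
}


def merge(node):
    """Merge sibling values whose (already merged) subtrees are equal.

    Works bottom-up: children are merged first, then siblings under the
    same parent are grouped by their merged subtree. The result is a
    frozenset of (values, subtree) pairs so that subtrees stay hashable.
    """
    if not isinstance(node, dict):
        return frozenset(node)  # leaf: set of accepted values
    grouped = {}
    for value, child in node.items():
        grouped.setdefault(merge(child), set()).add(value)
    return frozenset((frozenset(values), child) for child, values in grouped.items())


MERGED = merge(TREE)
```

Each `(values, subtree)` pair reads as "attribute IN values AND subtree", so for this tree the A1 = 1 and A1 = 2 branches under A3 = 100 collapse into a single A1 IN (1, 2) branch.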

For your example you would end up with this expression:

A3 = 100 AND A1 IN (1, 2) AND A2 IN (10, 20, 40, 50) OR
A3 = 200 AND A1 IN (1, 3) AND A2 IN (10, 30)
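Since the domains are tiny, the result can be checked against the original rules by brute force. This is only a sanity check under the domains stated in the question; `original`, `simplified` and `MISMATCHES` are illustrative names:

```python
from itertools import product

# Attribute domains from the question.
A1_VALUES = (1, 2, 3)
A2_VALUES = (10, 20, 30, 40, 50)
A3_VALUES = (100, 200)


def original(a1, a2, a3):
    """R = R1 OR R2 OR R3, written exactly as in the question."""
    r1 = a1 in (1, 2) and a2 in (20, 40, 50) and a3 in (100,)
    r2 = a1 in (1, 3) and a2 in (10, 30) and a3 in (200,)
    r3 = a1 in (1, 2) and a2 in (10,) and a3 in (100,)
    return r1 or r2 or r3


def simplified(a1, a2, a3):
    """The merged expression above."""
    return (a3 == 100 and a1 in (1, 2) and a2 in (10, 20, 40, 50)) or (
        a3 == 200 and a1 in (1, 3) and a2 in (10, 30)
    )


# Every combination where the two predicates disagree (should be none).
MISMATCHES = [
    combo
    for combo in product(A1_VALUES, A2_VALUES, A3_VALUES)
    if original(*combo) != simplified(*combo)
]
```

An empty `MISMATCHES` list means both expressions accept exactly the same entities.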

This process is pretty simple and could also shorten the expression, not only optimize it for evaluation. It might not be perfect, but it could be a way to start.
