Structure to store logical expressions in an RDBMS

Consider the following variables, generated by the player analyzer service:

    level = 6;
    errors = 4;
    score = 12;
    ...

And we have some rules and messages:

 1. errors == 0 AND level > 5 : Senior player
 2. score == 10 OR errors == 3: Border line player
 3. score > 10 AND score < 13: Not good, just passed
 4. ...

Now we should print the proper messages.

Another example: consider the following variables, generated by the food analyzer service:

    fruit = 2;
    coca = 6;
    ...

And we have some rules and messages:

 1. fruit == 0 : Consider buying some fruits
 2. coca == 0: That's healthy
 3. ...

Now we should print the proper messages.

How should I save the rules and messages in an RDBMS like MySQL so that it becomes easy to query and find the messages?

The worst method is to save the rules in one column and the messages in another, then load every record and test it in the host programming language.

Can you suggest a better method for this situation? Loading everything isn't a good method when we have a few thousand messages; we need a way to filter the messages on the DB side.

I've created a quick ERD to demonstrate how I'd initially design it: [image: ERD of the tables described below]

What do all these columns and tables mean?

property_name

This contains a list of everything that can have a value checked against it.

  • property_id - the primary key.
  • property_name - the text name of the item that has a value stored against it. Examples would be "errors", "level", "fruit".

operator

Contains a list of the different operators that are used for each property.

  • operator_id - the primary key.
  • operator_symbol - the symbol to use when checking for a value. I'm not sure if the actual symbol is the best value to store here, but it could work. Examples would be "==", ">", ">=".

rule_message

Stores the actual message being displayed.

  • rule_message_id - the primary key.
  • message - the text of the message to display. Examples would be "Senior player", "Consider buying fruits".

operator_property

This is the joining table between the three other tables, and contains your rules and logic.

  • property_operator_id - the primary key. This is a surrogate key; you can drop this column and make the PK (property_id, operator_id, rule_message_id) if you prefer.
  • property_id - the property_name record that is being used (e.g. the ID for "errors").
  • operator_id - the operator record that is being used (e.g. the ID for "==").
  • rule_message_id - the rule_message that is being used (e.g. the ID for "Senior player").
  • check_value - the value that the property is checked against using the operator. Examples would be 6, 4, 12.

How to use this design:

  • Add all your properties and operators to the tables.
  • To find the message to display for a scenario, such as checking what to show for a player:

SELECT rm.rule_message_id, rm.message
FROM rule_message rm
INNER JOIN operator_property op ON rm.rule_message_id = op.rule_message_id
INNER JOIN property_name pn ON op.property_id = pn.property_id
INNER JOIN operator o ON op.operator_id = o.operator_id
WHERE (
    pn.property_name = 'errors'
    AND o.operator_symbol = '=='
    AND op.check_value = 0
)
OR (
    pn.property_name = 'level'
    AND o.operator_symbol = '>'
    AND op.check_value = 5
)
-- each operator_property row holds one condition, so match either one
-- and keep only the messages for which both were found
GROUP BY rm.rule_message_id, rm.message
HAVING COUNT(*) = 2

This query would ideally return 1 row. If it returns 0, then no messages apply. If it returns 2 or more, the conditions didn't fit neatly into one of your criteria, so none of the messages apply.
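To try this design out, here is a minimal runnable sketch. It assumes an in-memory SQLite database in place of MySQL, and the column types and sample rows are illustrative choices, not part of the original ERD. Because each operator_property row stores a single condition, the lookup matches either condition and then keeps only the messages for which both were found.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE property_name (property_id INTEGER PRIMARY KEY, property_name TEXT);
CREATE TABLE operator (operator_id INTEGER PRIMARY KEY, operator_symbol TEXT);
CREATE TABLE rule_message (rule_message_id INTEGER PRIMARY KEY, message TEXT);
CREATE TABLE operator_property (
    property_operator_id INTEGER PRIMARY KEY,
    property_id INTEGER REFERENCES property_name(property_id),
    operator_id INTEGER REFERENCES operator(operator_id),
    rule_message_id INTEGER REFERENCES rule_message(rule_message_id),
    check_value INTEGER
);
INSERT INTO property_name VALUES (1, 'errors'), (2, 'level'), (3, 'score');
INSERT INTO operator VALUES (1, '=='), (2, '>'), (3, '<');
INSERT INTO rule_message VALUES (1, 'Senior player'), (2, 'Not good, just passed');
-- Message 1: errors == 0 AND level > 5; message 2: score > 10 AND score < 13
INSERT INTO operator_property VALUES
    (1, 1, 1, 1, 0), (2, 2, 2, 1, 5), (3, 3, 2, 2, 10), (4, 3, 3, 2, 13);
""")

LOOKUP = """
SELECT rm.rule_message_id, rm.message
FROM rule_message rm
JOIN operator_property op ON rm.rule_message_id = op.rule_message_id
JOIN property_name pn ON op.property_id = pn.property_id
JOIN operator o ON op.operator_id = o.operator_id
WHERE (pn.property_name = 'errors' AND o.operator_symbol = '==' AND op.check_value = 0)
   OR (pn.property_name = 'level'  AND o.operator_symbol = '>'  AND op.check_value = 5)
GROUP BY rm.rule_message_id, rm.message
HAVING COUNT(*) = 2  -- both conditions must be stored for the same message
"""

rows = conn.execute(LOOKUP).fetchall()
print(rows)
```

Note that this looks a message up by the conditions stored for it; it does not evaluate a player's actual values against the stored operators.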

Hope this helps! I've written articles on designing databases before, and the best tip I can give you is to work out the purpose of the data, which it seems like you already have.

Also, if you can think of better names for the tables, then go for it - this was just a quick design to illustrate the point.

Generally, this kind of rule interpretation is not done directly in the database; it will eventually be done in an interpreter like your check_rules_against_data, and that is absolutely fine.

It is quite common to write all the rules directly in one or more php files (surrounded, of course, by some code like if ($rule) { echo $message; } ). This is usually faster than dynamically evaluating every rule every time (and keep in mind that the database would have to do just that too). How you encode the filters depends on your needs: you can stick to your rule format; you could show the full php code and let the user edit it; or you could split the rules up and use the database design to e.g. verify that a variable exists (see e.g. my extended rule_term table below or completeitpro's answer). All of that would work just fine.
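As an illustration of rules written directly in code, here is a minimal sketch in Python rather than php (a substitution made for this example). The RULES list is hypothetical; the rule texts and sample values come from the question.

```python
# Each rule is a plain predicate over the variables, paired with its message.
RULES = [
    (lambda v: v["errors"] == 0 and v["level"] > 5, "Senior player"),
    (lambda v: v["score"] == 10 or v["errors"] == 3, "Border line player"),
    (lambda v: 10 < v["score"] < 13, "Not good, just passed"),
]


def check_rules_against_data(variables):
    """Return the messages of all rules that hold for the given variables."""
    return [message for rule, message in RULES if rule(variables)]


messages = check_rules_against_data({"level": 6, "errors": 4, "score": 12})
print(messages)
```

With a few thousand rules this still evaluates every predicate, which is the trade-off the preselection approach tries to reduce.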

If you want to, or just to test it, you can nevertheless do some preselection in your database. There are many ways to do it and to optimize it for special situations, and they depend heavily on what you actually want to do, so I will describe just one way to give you an idea.

It looks like you will have a ton of variables, but all of them integers (so owning a coke isn't stored as Items[x]='COCA' but as coca=1), so you can put them and the rules in tables like this:

variable

variableid | variablename | variabletype
----------------------------------------
1          | errors       | 1
2          | level        | 1 
3          | score        | 1 

user_variable

userid     | variableid  | valueint  
-------------------------------------
1          | 1           | 0         
1          | 2           | 6         
1          | 3           | 10         
2          | 1           | 3         
2          | 3           | 10        
3          | 1           | 0         
3          | 2           | 6         
3          | 3           | 10         
4          | 1           | 0         
4          | 2           | 5         

rule

ruleid | mincount | message
---------------------------
1      | 2        | Senior player          -> AND (2 terms have to fit)
2      | 1        | Border line player     -> OR (any 1 term can fit)

rule_term

ruleid | variableid | minvalueint | maxvalueint
-----------------------------------------------
1      | 1          | 0           | 0            -> errors == 0
1      | 2          | 6           | 9999         -> level > 5
2      | 1          | 3           | 3            -> errors == 3
2      | 3          | 10          | 10           -> score == 10

With these tables, you can now preselect the rules that hit:

select user_variable.userid, rule.ruleid, count(*) as cntfulfilled, 
       max(rule.mincount) as mincnt, max(rule.message) as message
from rule_term
join rule
on rule_term.ruleid = rule.ruleid
join user_variable 
on rule_term.variableid = user_variable.variableid
and rule_term.minvalueint <= user_variable.valueint 
and rule_term.maxvalueint >= user_variable.valueint
group by user_variable.userid, rule.ruleid
having count(*) >= max(rule.mincount);

This should count, for each user and each rule, how many subterms of that rule are fulfilled. The result should be, if I'm not mistaken:

userid | ruleid | cntfulfilled | mincnt | message
--------------------------------------------------
1      | 1      | 2            | 2      | Senior player
1      | 2      | 1            | 1      | Border line player
2      | 2      | 2            | 1      | Border line player
3      | 1      | 2            | 2      | Senior player
3      | 2      | 1            | 1      | Border line player

To express AND, mincnt should be the number of all subterms; for OR, it will be 1. For rules built with either plain AND or plain OR, this is already the complete test.
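The tables and the preselection query above can be run end to end with the following sketch. It assumes an in-memory SQLite database in place of MySQL, and the column types and primary keys are illustrative choices. An ORDER BY is added only to make the printed order stable.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE variable (variableid INTEGER PRIMARY KEY, variablename TEXT, variabletype INTEGER);
CREATE TABLE user_variable (userid INTEGER, variableid INTEGER, valueint INTEGER,
                            PRIMARY KEY (userid, variableid));
CREATE TABLE rule (ruleid INTEGER PRIMARY KEY, mincount INTEGER, message TEXT);
CREATE TABLE rule_term (ruleid INTEGER, variableid INTEGER,
                        minvalueint INTEGER, maxvalueint INTEGER);
INSERT INTO variable VALUES (1, 'errors', 1), (2, 'level', 1), (3, 'score', 1);
INSERT INTO user_variable VALUES
    (1, 1, 0), (1, 2, 6), (1, 3, 10),
    (2, 1, 3), (2, 3, 10),
    (3, 1, 0), (3, 2, 6), (3, 3, 10),
    (4, 1, 0), (4, 2, 5);
INSERT INTO rule VALUES (1, 2, 'Senior player'), (2, 1, 'Border line player');
INSERT INTO rule_term VALUES (1, 1, 0, 0), (1, 2, 6, 9999), (2, 1, 3, 3), (2, 3, 10, 10);
""")

PRESELECT = """
select user_variable.userid, rule.ruleid, count(*) as cntfulfilled,
       max(rule.mincount) as mincnt, max(rule.message) as message
from rule_term
join rule on rule_term.ruleid = rule.ruleid
join user_variable
  on rule_term.variableid = user_variable.variableid
 and rule_term.minvalueint <= user_variable.valueint
 and rule_term.maxvalueint >= user_variable.valueint
group by user_variable.userid, rule.ruleid
having count(*) >= max(rule.mincount)
order by user_variable.userid, rule.ruleid
"""

rows = conn.execute(PRESELECT).fetchall()
for row in rows:
    print(row)  # one row per (user, rule) pair that passes the mincount test
```

User 4 produces no row: only one of the two terms of rule 1 is fulfilled, which is below that rule's mincount of 2.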

For more complicated rules, you have to be able to recreate the rule in php to put it in your check function. You can e.g. encode it in a table like this:

Extended rule_term table:

ruleid | pos | cond | var.id | min | max
----------------------------------------
3      | 1   | 1    | 0      | 0   | 0     -> (
3      | 2   | 0    | 1      | 1   | 1     -> errors == 1
3      | 3   | 4    | 2      | 5   | 5     -> AND level == 5
3      | 4   | 2    | 0      | 0   | 0     -> )
3      | 5   | 5    | 3      | 10  | 10    -> OR score == 10

where I used cond=1: (, cond=2: ), cond=3: NOT, cond=4: AND, cond=5: OR. (There are better ways to encode it, e.g. express just the logic and group it in nested AND subgroups, but that would not improve anything here.)

This still allows you to preselect the rules that might fit, so you get only the rules you have to analyze afterwards in php. You can no longer rely on mincnt alone: with mincnt = 1 the rule is preselected even if only errors == 1 holds, which by itself does not satisfy the rule, and not just when score == 10 holds.
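Rebuilding and evaluating such a rule on the application side could look like the following sketch, written in Python instead of php. The function name, the row layout, and the handling of missing variables are assumptions made for this example; the cond codes are the ones listed above.

```python
# Token emitted for each cond code; cond = 0 marks a plain term.
COND_TOKENS = {1: "(", 2: ")", 3: "not", 4: "and", 5: "or"}

# Rows of the extended rule_term table for rule 3: (pos, cond, variableid, min, max).
RULE_3 = [
    (1, 1, 0, 0, 0),    # (
    (2, 0, 1, 1, 1),    # errors == 1
    (3, 4, 2, 5, 5),    # AND level == 5
    (4, 2, 0, 0, 0),    # )
    (5, 5, 3, 10, 10),  # OR score == 10
]


def evaluate_rule(rows, values):
    """Rebuild the boolean expression from the rows and evaluate it.

    `values` maps variableid -> integer. A missing variable makes its term
    false (an assumption of this sketch).
    """
    tokens = []
    for _pos, cond, variableid, low, high in sorted(rows):
        if cond in COND_TOKENS:
            tokens.append(COND_TOKENS[cond])
        if variableid != 0:
            value = values.get(variableid)
            tokens.append(str(value is not None and low <= value <= high))
    # Only the fixed tokens above plus True/False ever reach eval.
    return eval(" ".join(tokens), {"__builtins__": {}})


print(evaluate_rule(RULE_3, {1: 1, 2: 5, 3: 0}))  # errors == 1 and level == 5
print(evaluate_rule(RULE_3, {1: 1, 2: 6, 3: 0}))  # only errors == 1 holds
```

Using eval is acceptable here only because the expression is assembled from a fixed token set; a small hand-written parser would be the more defensive choice.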

You can add more things to it: you can add string variable types (add a column valuestr to user_variable and rule_term and adjust the joins) or a flag for 'NOT', and you can add more complicated conditions to your join if you are able to express them as rows in the rule_term table (e.g. combine 2 variables and check for 2 variables in a double join).

It's a little bit harder, but you might want to use left joins and some additional logic to compare variables that are not there (e.g. if you don't want to set the variable coca for everyone, just for users that have (or had) coke).
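One possible shape for the left-join variant is sketched below, again on in-memory SQLite. The app_user table, the variableid 5 for coca, and the choice to treat a missing variable as 0 are all assumptions made for this example.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE app_user (userid INTEGER PRIMARY KEY);
CREATE TABLE user_variable (userid INTEGER, variableid INTEGER, valueint INTEGER);
CREATE TABLE rule (ruleid INTEGER PRIMARY KEY, mincount INTEGER, message TEXT);
CREATE TABLE rule_term (ruleid INTEGER, variableid INTEGER,
                        minvalueint INTEGER, maxvalueint INTEGER);
INSERT INTO app_user VALUES (1), (2);
-- variableid 5 stands for coca; only user 1 has a row for it
INSERT INTO user_variable VALUES (1, 5, 6);
INSERT INTO rule VALUES (10, 1, 'That''s healthy');
-- single term: coca == 0
INSERT INTO rule_term VALUES (10, 5, 0, 0);
""")

QUERY = """
select u.userid, r.ruleid, r.message
from app_user u
cross join rule_term rt
join rule r on r.ruleid = rt.ruleid
left join user_variable uv
  on uv.userid = u.userid and uv.variableid = rt.variableid
-- a user without a row for the variable is treated as having the value 0
where coalesce(uv.valueint, 0) between rt.minvalueint and rt.maxvalueint
group by u.userid, r.ruleid, r.message
having count(*) >= max(r.mincount)
order by u.userid, r.ruleid
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```

The cross join pairs every user with every rule term, so this is more expensive than the inner-join version and is worth testing on realistic data volumes.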

If you want to use horizontal variables (a fixed number of variables, each in its own column), you should do the same for the rule terms (a min and a max column for each variable) and adjust the joins to check every column.
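A sketch of the horizontal layout, assuming in-memory SQLite and hypothetical table names user_wide and rule_wide. A range of 0 to 9999 is used here as "any value" for columns a rule does not test, mirroring the 9999 upper bound used earlier; as written, this layout expresses AND rules only.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_wide (userid INTEGER PRIMARY KEY,
                        errors INTEGER, level INTEGER, score INTEGER);
CREATE TABLE rule_wide (
    ruleid INTEGER PRIMARY KEY, message TEXT,
    errors_min INTEGER, errors_max INTEGER,
    level_min INTEGER, level_max INTEGER,
    score_min INTEGER, score_max INTEGER
);
INSERT INTO user_wide VALUES (1, 0, 6, 10), (2, 4, 6, 12);
-- 0..9999 means "any value" for a column the rule does not test
INSERT INTO rule_wide VALUES
    (1, 'Senior player', 0, 0, 6, 9999, 0, 9999),
    (3, 'Not good, just passed', 0, 9999, 0, 9999, 11, 12);
""")

QUERY = """
select u.userid, r.ruleid, r.message
from user_wide u
join rule_wide r
  on u.errors between r.errors_min and r.errors_max
 and u.level between r.level_min and r.level_max
 and u.score between r.score_min and r.score_max
order by u.userid, r.ruleid
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```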

This is just a general idea, and you obviously have tons of alternatives. The best option and optimization will largely depend on your actual needs, and spending more time thinking about your database design (or on how to generate dynamic php files) will later reduce frustration (a lot) or increase speed (a lot). And I will remind you again: test the option of generating dynamic php files - this will usually be a lot faster.

This is the classic case for a rule system, and it should likely not be implemented in the database. I put together a Java library (Rulette) which does pretty much this.

Essentially, you would set it up by creating a rule_system table and inserting an entry into it, and creating a rule input table with your entries (level, error, score). Judging by your samples, level and error seem to be 'VALUE' types while score seems to be a 'RANGE' type.

Now you can create a rule table ('player_rules {id, level, error, score}') to configure all your rules and map them to entries in an output table ('player_message {id, message}').

Good to go!!

RuleSystem rs = new RuleSystem("player-rule-system");
Rule r = rs.getRule(Map.of("level", level, "error", error, "score", score));
