Regular expression for validating a math equation

I'm trying to build a regex that validates a math equation. The equation itself is very simple: I want an English-readable expression that I will later evaluate to true or false. An example looks like this:

((1 and 2) or 3)

In this example, I will swap each number for either true or false. I will also replace "and" with "&&" and "or" with "||" in order to run the equation using PHP. The result will ultimately be either true or false.

An example final equation would look something like this:

((true && true) || true)
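
The substitution step described above can be sketched as follows. This is only an illustration in Python rather than PHP; the function name `to_boolean_expression` and the `values` mapping (number to truth value) are assumptions made for the example, and the function does not validate the input.

```python
import re


def to_boolean_expression(expr: str, values: dict[int, bool]) -> str:
    """Swap each number for true/false and each word operator for its symbol."""
    operators = {"and": "&&", "or": "||"}

    def swap(match: re.Match) -> str:
        token = match.group(0)
        if token.isdigit():
            # Hypothetical lookup: the caller decides what each number means.
            return "true" if values[int(token)] else "false"
        return operators[token]

    # Match either a run of digits or the whole words "and" / "or".
    return re.sub(r"\d+|\b(?:and|or)\b", swap, expr)


print(to_boolean_expression("((1 and 2) or 3)", {1: True, 2: True, 3: True}))
# prints: ((true && true) || true)
```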

Here are some more examples of what should be considered valid:

(1 or 2 or 3)

((1 and 2 and 3) or (4 and 5))

So, my question comes in two parts.

  1. Is it possible to create a regex to validate all possible valid equations? One big roadblock for me is understanding how I could validate that every opening "(" also has a closing ")".
  2. Is it advisable to use a regex to validate on the client side in this circumstance? I am already able to validate the expression using AJAX and PHP, so am I just overthinking this?

Using the pumping lemma, it can easily be proven that the strings you want to validate belong to a language that is not regular, so they cannot be parsed with regular expressions in the formal sense. (The proof relies on exactly the obstacle you mention in the first part of your question: a regular expression cannot match opening parentheses with closing ones, or even count them.) Some regex engines provide additional features that can be used for this, such as recursive patterns, but those go beyond the formal definition of regular expressions.
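
To make the counting argument concrete, here is a minimal Python sketch (the function name `parentheses_balanced` is made up for this example). It needs a counter that can grow without bound, which is precisely what a finite automaton, and therefore a formal regular expression, does not have.

```python
def parentheses_balanced(text: str) -> bool:
    """Return True if every "(" has a matching ")" in the right order."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                # A ")" appeared before its "(".
                return False
    # Any "(" still open at the end means the string is unbalanced.
    return depth == 0
```

This checks only the parentheses; it says nothing about whether the numbers and operators between them are well formed.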

You may consider parsing the parentheses yourself and validating the expression inside them using simple regular expressions, or you can build a parse tree, similar to what compilers do.
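
A rough sketch of the parse-tree approach, written in Python for illustration. It uses a simple regular expression only to split the input into tokens and a small recursive-descent parser for the nesting. The grammar, the function names (`tokenize`, `parse`, `evaluate`) and the tuple-based tree are all assumptions of this example; "and" is given higher precedence than "or", matching how PHP treats `&&` and `||`.

```python
import re

# A token is a number, the word "and"/"or", or a single parenthesis.
TOKEN = re.compile(r"\s*(\d+|(?:and|or)\b|[()])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(text):
    """Return a parse tree: an int leaf or an (operator, left, right) tuple."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def atom():
        token = take()
        if token is not None and token.isdigit():
            return int(token)
        if token == "(":
            node = or_expr()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return node
        raise ValueError(f"unexpected token {token!r}")

    def and_expr():
        node = atom()
        while peek() == "and":
            take()
            node = ("and", node, atom())
        return node

    def or_expr():
        node = and_expr()
        while peek() == "or":
            take()
            node = ("or", node, and_expr())
        return node

    tree = or_expr()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return tree


def evaluate(tree, values):
    """Evaluate a tree given a mapping from number to truth value."""
    if isinstance(tree, int):
        return values[tree]
    op, left, right = tree
    if op == "and":
        return evaluate(left, values) and evaluate(right, values)
    return evaluate(left, values) or evaluate(right, values)


tree = parse("((1 and 2) or 3)")
# tree is ("or", ("and", 1, 2), 3)
```

Validation then amounts to calling `parse` and treating a `ValueError` as "invalid"; the same tree can be evaluated directly, so no string needs to be handed to an interpreter.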

This can be done in PHP (which uses the PCRE engine).
Below is just an example.
You could comment out the error check, then insert boundary constructs
around the regex to make it definitively pass/fail.

The biggest problem is not the recursion, but defining the content boundary
conditions. I've pretty much boiled it down for you. These checks have to
be maintained however you do it (state, stacks, ...); it's all the same.

(This regex was constructed and tested using RegexFormat 6.)

Sample input:

 (((   (1 and 2 and 3) or (9) or ( ( 4 and 5)) and 5 ) and   7) )

Tested output:

 **  Grp 0 -  ( pos 0 , len 64 ) 
(((   (1 and 2 and 3) or (9) or ( ( 4 and 5)) and 5 ) and   7) )  
 **  Grp 1 -  ( pos 1 , len 62 ) 
((   (1 and 2 and 3) or (9) or ( ( 4 and 5)) and 5 ) and   7)   
 **  Grp 2 -  NULL 
 **  Grp 3 -  NULL 
 **  Grp 4 -  NULL 

Regex:
5/29 All Forms:

Empty form '( )' not allowed
Empty form ') (' not allowed
Form ') and (' ok
Form ') and 2 and (' ok
Form '( 1 and 2 )' ok
Form '( 1 )' ok
Form ') and 2 )' ok
Form '( 1 and (' ok
Form '( whitespace (' or ') whitespace )' ok
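
The same boundary conditions can also be maintained with explicit state and a depth counter instead of lookarounds. The Python sketch below is not a translation of the regex; it is one possible reading of the forms listed above as a "which token may follow which" table. The names `ALLOWED_NEXT` and `is_valid` are invented for the example, and requiring the whole expression to start with "(" is an assumption taken from the sample inputs.

```python
import re

# Which token kinds may follow which; "start" and "end" are sentinels.
# '(' -> ')' is absent, so the empty form '( )' is rejected.
# ')' -> '(' is absent, so the empty form ') (' is rejected.
ALLOWED_NEXT = {
    "start": {"("},            # assumption: the expression is parenthesized
    "(": {"(", "num"},
    "num": {"op", ")"},
    "op": {"(", "num"},
    ")": {"op", ")", "end"},
}

TOKEN = re.compile(r"\s*(?:(?P<num>\d+)|(?P<op>(?:and|or)\b)|(?P<paren>[()]))")


def is_valid(text: str) -> bool:
    text = text.rstrip()
    previous, depth, pos = "start", 0, 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            return False
        # Parentheses are their own kind; otherwise use the group name.
        kind = match.group("paren") or match.lastgroup
        if kind not in ALLOWED_NEXT[previous]:
            return False
        if kind == "(":
            depth += 1
        elif kind == ")":
            depth -= 1
            if depth < 0:
                return False
        previous, pos = kind, match.end()
    # Every "(" must be closed and the last token must be allowed to end.
    return depth == 0 and "end" in ALLOWED_NEXT[previous]
```

With these rules the sample input shown earlier is accepted, while `( )`, `(1) (2)` and `(1 and)` are rejected. To accept bare expressions such as `1 and 2`, one would add `"num"` to the `"start"` set and `"end"` to the `"num"` set.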

 # (?s)(?:\(((?!\s*\))(?&core))\)|\s*([()]))(?(DEFINE)(?<core>(?>(?&content)|\((?:(?!\s*\))(?&core))\)(?!\s*\())+)(?<content>(?>(?<=\))\s*(?:and|or)\s*(?=\()|(?<=\))\s*(?:(?:and|or)\s+\d+)+\s*(?:and|or)\s*(?=\()|(?<=\()\s*\d+(?:(?:\s+(?:and|or)\s+)?\d+)*\s*(?=\))|(?<=\))\s*(?:(?:and|or)\s+\d+)+\s*(?=\))|(?<=\()\s*(?:\d+\s+(?:and|or))+\s*(?=\()|\s+)))


 # //////////////////////////////////////////////////////
 # // The General Guide to 3-Part Recursive Parsing
 # // ----------------------------------------------
 # // Part 1. CONTENT
 # // Part 2. CORE
 # // Part 3. ERRORS

 (?s)                       # Dot-All modifier (used in a previous incarnation)

 (?:
      #           (                          # (1), Take off CONTENT (not used here)
      #                (?&content) 
      #           )
      #        |                           # OR

      \(                         # Open Paren's
      (                          # (1), parens CORE
           (?! \s* \) )               # Empty form '( )' not allowed
           (?&core) 
      )
      \)                         # Close Paren's
   |                           # OR
      \s* 
      (                          # (2), Unbalanced (delimiter) ERRORS
                                      # - Generally, on a whole parse, these
                                      #   are delimiter or content errors
           [()]                       
      )
 )

 # ///////////////////////
 # // Subroutines
 # // ---------------

 (?(DEFINE)

      # core
      (?<core>                   # (3)
           (?>
                (?&content) 
             |  
                \(                         # Open Paren's
                (?:
                     (?! \s* \) )               # Empty form '( )' not allowed
                     (?&core) 
                )
                \)                         # Close Paren's
                (?! \s* \( )               # Empty form ') (' not allowed
           )+
      )

      # content 
      (?<content>                # (4)
           (?>
                (?<= \) )                  # Form ') and ('
                \s* 
                (?: and | or )
                \s* 
                (?= \( )

             |  
                (?<= \) )                  # Form ') and 2 and ('
                \s* 
                (?:
                     (?: and | or )
                     \s+ 
                     \d+ 
                )+
                \s* 
                (?: and | or )
                \s* 
                (?= \( )

             |  
                (?<= \( )                  # Form '( 1 and 2 )'
                \s* 
                \d+ 
                (?:
                     (?:
                          \s+ 
                          (?: and | or )
                          \s+ 
                     )?
                     \d+ 
                )*
                \s* 
                (?= \) )

             |  
                (?<= \) )                  # Form ') and 2 )'
                \s* 
                (?:
                     (?: and | or )
                     \s+ 
                     \d+ 
                )+
                \s* 
                (?= \) )

             |  
                (?<= \( )                  # Form '( 1 and ('
                \s* 
                (?:

                     \d+ 
                     \s+ 
                     (?: and | or )
                )+
                \s* 
                (?= \( )

             |  
                \s+                        # Interstitial whitespace
                                           # '( here (' or ') here )'
           )
      )

 )
