
Regex for boolean logic

I am trying to use a regex to validate a string. It should allow whitespace between a string and a boolean operator, as in (@string1 OR), but disallow whitespace inside a string, as in (string 1). The boolean forms allowed are:

(A AND B) AND (NOT C)
(A OR B) AND (NOT C)
(A AND B)
(A OR B)
(NOT C)

Examples of valid and invalid inputs are below.

Valid:

(@string1 OR @string2) AND ( NOT @string3)
(@string-1 AND @string.2) AND ( NOT @string_3)
(@string1 OR @string2 OR @string4) AND ( NOT @string3 AND NOT @string5)
(@string1    OR   @string2   OR    @string4)
(@string1 AND @string2 AND @string4)
( NOT @string1 AND NOT @string2 AND NOT @string4)
( NOT @string1 AND NOT @string2)

Invalid:

()
(string  1 OR @str ing2) AND ( NOT @tag3)
(@string 1 OR @tag 2) AND ( NOT @string 3)
(@string1  @string2) ( NOT @string3)
(@string1 OR @string12) AND (@string3)
(@string1 AND NOT @string2)

Is it better to parse the string and then have multiple regexes check for the absence of whitespace, or can a single regex be written to check the entire string?

This kind of sophisticated validation is best solved with a grammar parser.

Just to get you started, here is an (incomplete) solution using parslet. As you can see, you start from primitives and build up more and more complicated structures.

require 'parslet'

class Boolean < Parslet::Parser
  rule(:space)  { match[" "].repeat(1) }
  rule(:space?) { space.maybe }

  rule(:lparen) { str("(") >> space? }
  rule(:rparen) { str(")") >> space? }

  rule(:and_operator) { str("AND") >> space? }
  rule(:or_operator) { str("OR") >> space? }
  rule(:not_operator) { str("NOT") >> space? }

  rule(:token) { str("@") >> match["a-z0-9"].repeat >> space? }

  # The primary rule deals with parentheses.
  rule(:primary) { lparen >> expression >> rparen | token }

  rule(:and_expression) { primary >> and_operator >> primary }
  rule(:or_expression) { primary >> or_operator >> primary }
  rule(:not_expression) { not_operator >> primary }

  rule(:expression) { or_expression | and_expression | not_expression | primary }

  root(:expression)
end

You can test a string with this little helper method:

def parse(str)
  exp = Boolean.new
  exp.parse(str)
  puts "Valid!"
rescue Parslet::ParseFailed => failure
  puts failure.parse_failure_cause.ascii_tree
end

parse("@string AND (@string2 OR @string3)")
#=> Valid!
parse("(string1 AND @string2)")
#=> Expected one of [OR_EXPRESSION, AND_EXPRESSION, NOT_EXPRESSION, PRIMARY] at line 1 char 1.
#   ...
#   - Failed to match sequence ('@' [a-z0-9]{0, } SPACE?) at line 1 char 2.
#      - Expected "@", but got "s" at line 1 char 2.
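
If pulling in the parslet gem is not an option, the same "build up from primitives" idea can be sketched by hand with StringScanner from Ruby's standard library. The version below targets only the shapes listed in the question. It rests on a few assumptions the question does not confirm: tokens are "@" followed by lowercase letters, digits, ".", "_" or "-"; a positive group needs at least two tokens joined by one operator throughout; spaces around operators are optional. The class name QueryValidator and its helper methods are made up for this illustration.

```ruby
require 'strscan'

# Hand-written validator for the query shapes listed in the question.
# Assumptions: lowercase tokens with ".", "_" and "-" allowed, one operator
# per positive group, NOT terms joined by AND only.
class QueryValidator
  TOKEN = /@[a-z0-9._-]+/

  def initialize(input)
    @s = StringScanner.new(input)
  end

  # True when the whole input is "(NOT ...)", "(A op B ...)" or
  # "(A op B ...) AND (NOT ...)".
  def valid?
    skip_space
    matched = not_group || (positive_group && tail)
    matched ? @s.eos? : false
  end

  private

  # Run the block; rewind the scanner if it returns a falsy value.
  def attempt
    start = @s.pos
    result = yield
    @s.pos = start unless result
    result
  end

  def skip_space
    @s.skip(/ */)
    true
  end

  # Match a pattern, then swallow any spaces that follow it.
  def lexeme(pattern)
    @s.skip(pattern) && skip_space
  end

  def token
    lexeme(TOKEN)
  end

  def negated
    lexeme(/NOT/) && token
  end

  # ( NOT @a AND NOT @b ... )
  def not_group
    attempt do
      next false unless lexeme(/\(/) && negated
      while attempt { lexeme(/AND/) && negated }
        # keep consuming "AND NOT @token"
      end
      lexeme(/\)/)
    end
  end

  # ( @a OP @b OP @c ... ) with the same OP (AND or OR) throughout.
  def positive_group
    attempt do
      next false unless lexeme(/\(/) && token
      op = @s.scan(/AND|OR/)
      next false unless op && skip_space && token
      while attempt { lexeme(/#{op}/) && token }
        # keep consuming "OP @token"
      end
      lexeme(/\)/)
    end
  end

  # Optional "AND ( NOT ... )" after a positive group.
  def tail
    attempt { lexeme(/AND/) && not_group } || true
  end
end

p QueryValidator.new("(@string1 OR @string2) AND ( NOT @string3)").valid? #=> true
p QueryValidator.new("(@string1 AND NOT @string2)").valid?                #=> false
```

Each method only knows about one small piece (a token, a negated term, a group), so tightening a rule, such as allowing a single token in a positive group, means touching one method rather than rewriting a long pattern.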

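You need recursion or a loop with a stack to parse this correctly. Doing it with a regex alone would be very difficult, though not impossible if all you need is validation.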
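
For what it's worth, the shapes listed in the question never nest, so for those specific forms a single anchored regex is workable when assembled from smaller pieces. This is only a sketch under assumptions the question does not spell out: tokens are "@" plus lowercase letters, digits, ".", "_" or "-"; a positive group has at least two tokens and does not mix AND with OR; spaces around operators are optional. The constant names are invented for this example.

```ruby
# Building blocks, interpolated into larger patterns below.
TOKEN = /@[a-z0-9._-]+/
NEG   = /NOT *#{TOKEN}/

# ( @a AND @b ... ) or ( @a OR @b ... ), never a mix of the two.
POS_GROUP = /\( *#{TOKEN}(?: *AND *#{TOKEN})+ *\)|\( *#{TOKEN}(?: *OR *#{TOKEN})+ *\)/

# ( NOT @a AND NOT @b ... )
NOT_GROUP = /\( *#{NEG}(?: *AND *#{NEG})* *\)/

# Whole input: a NOT group alone, or a positive group with an optional
# "AND ( NOT ... )" tail.
QUERY = /\A *(?:#{NOT_GROUP}|(?:#{POS_GROUP})(?: *AND *#{NOT_GROUP})?) *\z/

p QUERY.match?("(@string1 OR @string2) AND ( NOT @string3)") #=> true
p QUERY.match?("(@string 1 OR @tag 2) AND ( NOT @string 3)") #=> false
```

This holds up only because the grammar is flat. Once groups can nest to arbitrary depth, a pattern like this stops being maintainable and a real parser is the better tool.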


 