Convert regular expression to PegJs Grammar

I'm new to PEGjs and I'm trying to convert the RegEx (\\s*[\\(])|(\\s*[\\)])|(\\"[^\\(\\)]+?\\")|([^\\(\\)\\s]+) to a PEGjs grammar.

Basically, what I'm trying to do is transform the test input

(App= smtp AND "SPort" != 25) OR (App= pop3 AND "SPort" != 110) OR (App = imap AND "SPort" != 143) AND (App= imap OR "SPort" != 143)

to the JSON format below:

{
  "eventTypes": [
    "All"
  ],
  "condition": {
    "operator": "and",
    "terms": [
      {
        "operator": "or",
        "terms": [
          {
            "operator": "or",
            "terms": [
              {
                "operator": "and",
                "terms": [
                  {
                    "name": "App",
                    "operator": "equals",
                    "value": "smtp"
                  },
                  {
                    "name": "Sport",
                    "operator": "notEquals",
                    "value": "25"
                  }
                ]
              },
              {
                "operator": "and",
                "terms": [
                  {
                    "name": "App",
                    "operator": "equals",
                    "value": "pop3"
                  },
                  {
                    "name": "Sport",
                    "operator": "notEquals",
                    "value": "110"
                  }
                ]
              }
            ]
          },
          {
            "operator": "and",
            "terms": [
              {
                "name": "App",
                "operator": "equals",
                "value": "imap"
              },
              {
                "name": "Sport",
                "operator": "notEquals",
                "value": "143"
              }
            ]
          }
        ]
      },
      {
        "operator": "or",
        "terms": [
          {
            "name": "App",
            "operator": "equals",
            "value": "imap"
          },
          {
            "name": "Sport",
            "operator": "notEquals",
            "value": "143"
          }
        ]
      }
    ]
  }
}

I have written some fairly complex JavaScript code to transform the sample input to the JSON format shown above, but the code is complicated and not easy to maintain in the long term, so I thought I'd give a grammar parser a try. Since I'm new to the grammar world, I'm looking for some help or guidance on implementing a grammar that does the above, so that I can extend it as needed.

You can see the output of the Regex here
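To see what the regex produces without following the link, here is a minimal sketch that runs it over a shortened version of the test input. It assumes the doubled backslashes in the question come from string escaping, so the pattern is written as a regex literal with single backslashes; the names `tokenRe`, `input` and `tokens` are only for illustration.

```javascript
// The regex from the question as a literal (assumption: the doubled
// backslashes in the question are string escapes).
const tokenRe = /(\s*[\(])|(\s*[\)])|(\"[^\(\)]+?\")|([^\(\)\s]+)/g;

// A shortened version of the test input.
const input = '(App= smtp AND "SPort" != 25) OR (App= pop3 AND "SPort" != 110)';

// With the g flag, match() returns every full match. The parenthesis
// alternatives can swallow leading whitespace, so each token is trimmed.
const tokens = input.match(tokenRe).map(function (t) { return t.trim(); });

console.log(tokens);
// [ '(', 'App=', 'smtp', 'AND', '"SPort"', '!=', '25', ')', 'OR',
//   '(', 'App=', 'pop3', 'AND', '"SPort"', '!=', '110', ')' ]
```

Note that the regex only splits the input into a flat list of tokens. It does not recognise nesting or operators, which is the part a grammar has to add.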

EDIT

JavaScript solution:

 var str = '((Application = smtp AND "Server Port" != 25) AND (Application = smtp AND "Server Port" != 25)) OR (Application = pop3 AND "Server Port" != 110) OR (Application = imap AND "Server Port" != 143) AND (Application = imap OR "Server Port" != 143)';

var final = str.replace(/\((?!\()/g,"['")        //replace ( with [' if it's not followed by (
           .replace(/\(/g,"[")               //replace ( with [
           .replace(/\)/g,"']")              //replace ) with '] 
           .replace(/\sAND\s/g,"','AND','")  //replace AND with ','AND','
           .replace(/\sOR\s/g,"','OR','")    //replace OR with ','OR','
           .replace(/'\[/g,"[")              //replace '[ with [
           .replace(/\]'/g,"]")              //replace ]' with ]
           .replace(/"/g,"\\\"")             //escape double quotes
           .replace(/'/g,"\"");              //replace ' with "
console.log(JSON.parse("["+final+"]"))

To the best of my knowledge, you cannot get exactly the result you want from a directly recursive rule, because the rule would have to be left-recursive and the parser would loop forever. Specifically, given the following input:

A OR B OR C

You are asking for this output:

(A OR B) OR C

To get this result, you'd need to have a rule like this:

BOOL = left:( BOOL / Expression ) "OR" right:( Expression )

This creates an infinite loop, as BOOL can never be resolved: the first thing BOOL tries to match is BOOL itself, so no input is ever consumed. However, we can get

A OR ( B OR C )

because

BOOL = left:( Expression ) "OR" right:( BOOL / Expression )

does not create an infinite loop. This is because we can begin to match something before recursing back into BOOL. It's a little heady, I know, but trust me... you've got to give PegJS something to start matching before you can recurse.
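If the left-nested shape really is required, one common workaround is to avoid recursion for the chain altogether: match a first operand followed by a repetition of ("OR" operand) pairs, then fold that flat list from the left in the rule's action. Below is a minimal sketch of that fold in plain JavaScript. `foldLeft` is a hypothetical helper name, and `head` and `tail` stand in for whatever such a rule would capture; it is not part of the grammar that follows.

```javascript
// Hypothetical helper: folds a first operand (head) and the list of
// operands that follow it (tail) into a left-nested tree.
function foldLeft(operator, head, tail) {
  return tail.reduce(function (left, right) {
    return { operator: operator, terms: [left, right] };
  }, head);
}

// "A OR B OR C" captured as head = 'A', tail = ['B', 'C'].
const tree = foldLeft('or', 'A', ['B', 'C']);

console.log(JSON.stringify(tree));
// {"operator":"or","terms":[{"operator":"or","terms":["A","B"]},"C"]}
```

The result corresponds to (A OR B) OR C, and no rule ever has to start by matching itself.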

If the right-nested form A OR ( B OR C ) is acceptable, then I believe this grammar would get you pretty close to the desired output:

// Our top-level rule is Expression
Expression
  = BOOL
  / SubExpression
  / Comparison
  / Term

// A sub expression is just an expression wrapped in parentheses
// Note that this does not cause an infinite loop because the first term is always "("
SubExpression
  = _ "(" _ innards: Expression _ ")" _ { return innards; }

Comparison
  = name:Term _ operator:("=" / "!=") _ value:Term {
      return {
        name: name,
        operator: operator === '=' ? 'equals' : 'notEquals',
        value: value,
      };
    }

BOOL = AND / OR

// We separate AND and OR so that AND is tried first and ends up at the top of the tree.
// Its operands may themselves be OR chains, so OR binds more tightly than AND here.
AND
  = _ left:( OR / SubExpression / Comparison ) _ "AND" _ right:( AND / OR / SubExpression / Comparison ) _ {
    return {
      operator: 'and',
      terms: [ left, right ]
    }
  }

OR
  = _ left:( SubExpression / Comparison ) _ "OR" _ right:( OR / SubExpression / Comparison ) _ {
    return {
      operator: 'or',
      terms: [ left, right ]
    }
  }

Term
  = '"'? value:$( [0-9a-zA-Z]+ ) '"'? {
      return value;
    }

Integer "integer"
  = _ [0-9]+ { return parseInt(text(), 10); }

_ "whitespace"
  = [ \t\n\r]*

Given your input, we'd get:

{
   "operator": "and",
   "terms": [
      {
         "operator": "or",
         "terms": [
            {
               "operator": "and",
               "terms": [
                  {
                     "name": "App",
                     "operator": "equals",
                     "value": "smtp"
                  },
                  {
                     "name": "SPort",
                     "operator": "notEquals",
                     "value": "25"
                  }
               ]
            },
            {
               "operator": "or",
               "terms": [
                  {
                     "operator": "and",
                     "terms": [
                        {
                           "name": "App",
                           "operator": "equals",
                           "value": "pop3"
                        },
                        {
                           "name": "SPort",
                           "operator": "notEquals",
                           "value": "110"
                        }
                     ]
                  },
                  {
                     "operator": "and",
                     "terms": [
                        {
                           "name": "App",
                           "operator": "equals",
                           "value": "imap"
                        },
                        {
                           "name": "SPort",
                           "operator": "notEquals",
                           "value": "143"
                        }
                     ]
                  }
               ]
            }
         ]
      },
      {
         "operator": "or",
         "terms": [
            {
               "name": "App",
               "operator": "equals",
               "value": "imap"
            },
            {
               "name": "SPort",
               "operator": "notEquals",
               "value": "143"
            }
         ]
      }
   ]
}
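This output is only the condition subtree. The target format in the question wraps it in an object that also carries "eventTypes". A minimal sketch of that last step, assuming the event types are supplied by the caller and default to ["All"] as in the question; `toRule` is a hypothetical helper name, and the small `condition` object stands in for the parser's result.

```javascript
// Hypothetical helper: wraps a parsed condition tree in the envelope
// shown in the question.
function toRule(condition, eventTypes) {
  return {
    eventTypes: eventTypes || ['All'], // assumption: default taken from the question
    condition: condition
  };
}

// A small stand-in for the tree returned by the parser.
const condition = {
  operator: 'and',
  terms: [
    { name: 'App', operator: 'equals', value: 'smtp' },
    { name: 'SPort', operator: 'notEquals', value: '25' }
  ]
};

const rule = toRule(condition);
console.log(JSON.stringify(rule, null, 2));
```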
