Context-free grammar to represent regular expressions

I'm trying to write a context-free grammar to represent simple regular expressions. The symbols I want are [0-9], [a-z], and [A-Z], and the operators are "|", "()", and "." for concatenation. For repetition I only want "*" for now; later I will add "+", "?", etc. I tried this grammar in JavaCC:

void RE(): {}
{
    FINAL(0) ( "." FINAL(0) | "|" FINAL(0))*
}

void FINAL(int sign): { Token t; }
{
    t = <SYMBOL> {
        if ( sign == 1 )
            jjtThis.val = t.image + "*";
        else
            jjtThis.val = t.image;
    }
    | FINAL(1) "*"
    | "(" RE() ")"
}

The problem is in the FINAL production: the alternative | FINAL(1) "*" gives me the error Left recursion detected: "FINAL... --> FINAL... . Putting "*" to the left of FINAL(1) resolves the problem, but that is not what I want.

I already tried reading the Wikipedia article on removing left recursion, but I really don't know how to do it. Can someone help?

The following takes care of the left recursion:

RE --> FINAL ("." FINAL | "|" FINAL)*
FINAL --> PRIMARY ( "*" )*
PRIMARY --> <SYMBOL> | "(" RE ")"

However, that won't give "." precedence over "|". For that, you can do the following:

RE --> TERM ("|" TERM)*
TERM --> FINAL ("." FINAL)*
FINAL --> PRIMARY ( "*" )*
PRIMARY --> <SYMBOL> | "(" RE ")"
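
To see the precedence this grammar produces, here is a minimal hand-written recursive-descent sketch in plain Java. It is not JavaCC output, and the class name RegexParser, its method names, and the choice to treat <SYMBOL> as a single character from [0-9a-zA-Z] are all assumptions made for illustration. It returns a fully parenthesized string so the grouping is visible.

```java
// Illustrative recursive-descent parser (hypothetical names, not generated by JavaCC) for:
//   RE      --> TERM ("|" TERM)*
//   TERM    --> FINAL ("." FINAL)*
//   FINAL   --> PRIMARY ("*")*
//   PRIMARY --> <SYMBOL> | "(" RE ")"
// Assumption: <SYMBOL> is one character in [0-9a-zA-Z].
class RegexParser {
    private final String input;
    private int pos = 0;

    private RegexParser(String input) { this.input = input; }

    // Parses the whole input and returns a fully parenthesized form.
    static String parse(String input) {
        RegexParser p = new RegexParser(input);
        String result = p.re();
        if (p.pos != input.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + p.pos);
        }
        return result;
    }

    // Consumes c if it is the next character.
    private boolean accept(char c) {
        if (pos < input.length() && input.charAt(pos) == c) { pos++; return true; }
        return false;
    }

    private static boolean isSymbol(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // RE --> TERM ("|" TERM)*   (lowest precedence)
    private String re() {
        String left = term();
        while (accept('|')) left = "(" + left + "|" + term() + ")";
        return left;
    }

    // TERM --> FINAL ("." FINAL)*
    private String term() {
        String left = fin();
        while (accept('.')) left = "(" + left + "." + fin() + ")";
        return left;
    }

    // FINAL --> PRIMARY ("*")*   (the loop replaces the left-recursive alternative)
    private String fin() {
        String operand = primary();
        while (accept('*')) operand = "(" + operand + "*)";
        return operand;
    }

    // PRIMARY --> <SYMBOL> | "(" RE ")"
    private String primary() {
        if (accept('(')) {
            String inner = re();
            if (!accept(')')) throw new IllegalArgumentException("Expected ')' at position " + pos);
            return inner;
        }
        if (pos < input.length() && isSymbol(input.charAt(pos))) {
            return String.valueOf(input.charAt(pos++));
        }
        throw new IllegalArgumentException("Expected a symbol or '(' at position " + pos);
    }

    public static void main(String[] args) {
        System.out.println(parse("a|b.c*")); // prints (a|(b.(c*)))
    }
}
```

Each nonterminal becomes one method, and each ( ... )* becomes a while loop, so "*" binds tightest, then ".", then "|".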

The general rule is that

A --> A b | c | d | ...

can be transformed to

A --> B b*
B --> c | d | ...

where B is a new nonterminal.
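
Read as code, the transformed rule is a loop. A direct recursive-descent translation of A --> A b would call itself before consuming any input and never terminate, whereas A --> B b* consumes one B and then zero or more b. The sketch below is an illustration only: the class LeftRecursionDemo and the method parseA are hypothetical names, and the terminals b, c, d are taken as the literal characters 'b', 'c', 'd'.

```java
// Illustrative sketch (hypothetical names) of the transformed rule
//   A --> B b*
//   B --> c | d
// which accepts the same strings as the left-recursive  A --> A b | c | d.
class LeftRecursionDemo {
    // Returns a left-nested string showing the structure the original rule describes.
    static String parseA(String input) {
        if (input.isEmpty()) {
            throw new IllegalArgumentException("Expected 'c' or 'd' but input is empty");
        }
        // B --> c | d
        char first = input.charAt(0);
        if (first != 'c' && first != 'd') {
            throw new IllegalArgumentException("Expected 'c' or 'd' at position 0");
        }
        String tree = String.valueOf(first);
        int pos = 1;
        // b*  : each iteration wraps the tree built so far, giving left nesting
        while (pos < input.length() && input.charAt(pos) == 'b') {
            tree = "(" + tree + " b)";
            pos++;
        }
        if (pos != input.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + pos);
        }
        return tree;
    }

    public static void main(String[] args) {
        System.out.println(parseA("cbb")); // prints ((c b) b)
    }
}
```

Because the loop wraps the result built so far on every iteration, the left-associative shape of the original rule is preserved even though the grammar itself is no longer left-recursive.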
