Ambiguity or conflict in an LL(1) grammar for a shell in C

I'm implementing an LL(1) parser for a shell implementation project. I'm stuck trying to resolve conflicts in my grammar:

Parsing mode: LL(1).

Grammar:

     1. COMMAND_LINE -> COMPLETE_COMMAND PIPED_CMD
     2. PIPED_CMD -> PIPE COMPLETE_COMMAND PIPED_CMD
     3.            | ε
     4. COMPLETE_COMMAND -> CMD_PREFIX CMD CMD_SUFFIX
     5. CMD_PREFIX -> REDIRECTION CMD_PREFIX
     6.             | ε
     7. CMD_SUFFIX -> REDIRECTION CMD_SUFFIX
     8.             | CMD_ARG CMD_SUFFIX
     9.             | ε
    10. REDIRECTION -> REDIRECTION_OP WORD
    11.              | ε
    12. CMD -> WORD
    13. CMD_ARG -> WORD CMD_ARG
    14.          | SINGLE_QUOTE WORD DOUBLE_QUOTE CMD_ARG
    15.          | DOUBLE_QUOTE WORD DOUBLE_QUOTE CMD_ARG
    16.          | ε
    17. REDIRECTION_OP -> HERE_DOC
    18.                 | APPEND
    19.                 | INFILE
    20.                 | OUTFILE

I use syntax-cli to check my grammar, and the LL(1) parser is a homemade implementation; I can link my parser implementation if needed. The conflicts detected by syntax-cli are:

|             | PIPE | WORD  | SINGLE_QUOTE | DOUBLE_QUOTE | HERE_DOC | APPEND | INFILE | OUTFILE | $  |
|-------------|------|-------|--------------|--------------|----------|--------|--------|---------|----|
| CMD_SUFFIX  | 9    | 7/8   | 7/8          | 7/8          | 7/8      | 7/8    | 7/8    | 7/8     | 9  |
| REDIRECTION | 11   | 11    | 11           | 11           | 10/11    | 10/11  | 10/11  | 10/11   | 11 |
| CMD_ARG     | 16   | 13/16 | 14/16        | 15/16        | 16       | 16     | 16     | 16      | 16 |

I've also tried this grammar:


COMMAND_LINE     
                 : COMPLETE_COMMAND PIPED_CMD
                 ;
PIPED_CMD        
                 : PIPE COMPLETE_COMMAND PIPED_CMD
                 |
                 ;
COMPLETE_COMMAND 
                 : REDIRECTION CMD REDIRECTION CMD_ARG REDIRECTION
                 ;
REDIRECTION      
                 : REDIRECTION_OP WORD
                 | 
                 ;

CMD              
                 : WORD
                 ;
CMD_ARG          
                 : WORD REDIRECTION CMD_ARG
                 | SINGLE_QUOTE WORD DOUBLE_QUOTE REDIRECTION CMD_ARG
                 | DOUBLE_QUOTE WORD DOUBLE_QUOTE REDIRECTION CMD_ARG
                 | REDIRECTION
                 ;
REDIRECTION_OP
                 : HERE_DOC
                 | APPEND
                 | INFILE
                 | OUTFILE
                 ;

But the parser doesn't work when using multiple redirections...

Without a more precise specification from you, I can't be sure this covers everything. But this grammar is indeed ambiguous.

To build an LL(1) analyzer, you must be able to say, for any combination of a symbol on the analyzer stack (a terminal or a non-terminal yet to be read) and the next word in the input buffer, which rule should apply.
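As a rough illustration of that lookup, here is a minimal table-driven sketch in Python. The grammar (`LINE`, `REST`, `CMD`, `ARGS`) and the contents of `TABLE` are hypothetical and much smaller than the grammar in the question; the only point is that every (stack symbol, next token) pair maps to at most one rule.

```python
# Hypothetical toy grammar, NOT the grammar from the question:
#   LINE -> CMD REST
#   REST -> PIPE CMD REST | ε
#   CMD  -> WORD ARGS
#   ARGS -> WORD ARGS | ε
NONTERMINALS = {"LINE", "REST", "CMD", "ARGS"}

# LL(1) table: (non-terminal on the stack, next token) -> right-hand side.
# Each cell holds exactly one rule; a missing cell means syntax error.
TABLE = {
    ("LINE", "WORD"): ["CMD", "REST"],
    ("REST", "PIPE"): ["PIPE", "CMD", "REST"],
    ("REST", "$"): [],
    ("CMD", "WORD"): ["WORD", "ARGS"],
    ("ARGS", "WORD"): ["WORD", "ARGS"],
    ("ARGS", "PIPE"): [],
    ("ARGS", "$"): [],
}


def parse(tokens):
    """Return True if the token list is accepted by the toy grammar."""
    toks = list(tokens) + ["$"]   # "$" marks the end of input
    stack = ["$", "LINE"]         # start symbol on top of the end marker
    pos = 0
    while stack:
        top = stack.pop()
        look = toks[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                return False      # no rule for this pair: syntax error
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
        elif top == look:
            pos += 1              # terminal matched, consume it
        else:
            return False
    return pos == len(toks)
```

With this table, `parse(["WORD", "WORD", "PIPE", "WORD"])` is accepted, while `parse(["WORD", "PIPE"])` is rejected because no cell exists for `(CMD, $)`. A conflict means some cell would need two rules, and the loop above would have no way to pick one.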

Put yourself in the situation where your input starts with a WORD (that is, WORD is the first thing in the input buffer).

You start by trying to analyze COMMAND_LINE.

If the input buffer starts with WORD, only one rule can be applied to COMMAND_LINE, namely COMMAND_LINE -> COMPLETE_COMMAND PIPED_CMD. (Whatever the input, this is the only rule anyway: either we can apply it, or it is a syntax error. For now there is no reason to raise a syntax error, since this rule is compatible with an input starting with WORD.)

So now the stack holds COMPLETE_COMMAND PIPED_CMD, and the input buffer still holds the same WORD.

The only possible rule for the top of the stack is COMPLETE_COMMAND -> CMD_PREFIX CMD CMD_SUFFIX.

So now the stack holds CMD_PREFIX CMD CMD_SUFFIX PIPED_CMD, and WORD is still waiting in the input buffer.

Two rules can be applied from CMD_PREFIX:
CMD_PREFIX -> REDIRECTION CMD_PREFIX
or CMD_PREFIX -> ε

Neither of them can start with WORD, so CMD_PREFIX has to derive the empty string here. Either we say that what we have is an empty CMD_PREFIX (followed by a CMD starting with WORD),

or we see it as a REDIRECTION followed by an empty prefix, since REDIRECTION can be REDIRECTION -> ε.

So both are possible at this point: either we have CMD_PREFIX(ε), or we have CMD_PREFIX(REDIRECTION(ε), ε) (or even more levels of recursion).
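To make the "even more recursions" part concrete, here is a small Python sketch that enumerates parse trees of CMD_PREFIX deriving the empty string, using only rules 5, 6 and 11 of the grammar in the question. The function name `empty_prefix_trees` and the `max_depth` cut-off are my own, added only so the enumeration terminates.

```python
def empty_prefix_trees(max_depth):
    """Yield parse trees of CMD_PREFIX that derive the empty string.

    Trees are nested tuples. max_depth limits how many times rule 5 is
    applied; without that limit the enumeration would never end.
    """
    # Rule 6: CMD_PREFIX -> ε
    yield ("CMD_PREFIX", "ε")
    if max_depth > 0:
        # Rule 5: CMD_PREFIX -> REDIRECTION CMD_PREFIX,
        # with rule 11: REDIRECTION -> ε for the first child.
        for rest in empty_prefix_trees(max_depth - 1):
            yield ("CMD_PREFIX", ("REDIRECTION", "ε"), rest)


if __name__ == "__main__":
    for tree in empty_prefix_trees(2):
        print(tree)
```

Every tree produced consumes no input at all, yet they are all different, and raising `max_depth` by one adds one more. That is exactly what an ambiguous grammar looks like: one input, several parse trees.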

For the grammar to be LL(1), we should not have to look any deeper to decide. At this point, knowing only that the next lexeme is WORD, we should be able to choose between those two. We can't.
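You can check this mechanically. The sketch below is my own Python, not syntax-cli's code: it builds the LL(1) table for the first grammar with the textbook FIRST/FOLLOW construction and keeps every rule predicted for a cell. Rule numbers are the ones from the question; `RULES`, `first_of`, `analyse`, `ll1_table` and `conflicts` are names I made up. Since this is the textbook construction, it may flag cells that the tool's table in the question does not show.

```python
RULES = {  # rule number -> (left-hand side, right-hand side); () is ε
    1: ("COMMAND_LINE", ("COMPLETE_COMMAND", "PIPED_CMD")),
    2: ("PIPED_CMD", ("PIPE", "COMPLETE_COMMAND", "PIPED_CMD")),
    3: ("PIPED_CMD", ()),
    4: ("COMPLETE_COMMAND", ("CMD_PREFIX", "CMD", "CMD_SUFFIX")),
    5: ("CMD_PREFIX", ("REDIRECTION", "CMD_PREFIX")),
    6: ("CMD_PREFIX", ()),
    7: ("CMD_SUFFIX", ("REDIRECTION", "CMD_SUFFIX")),
    8: ("CMD_SUFFIX", ("CMD_ARG", "CMD_SUFFIX")),
    9: ("CMD_SUFFIX", ()),
    10: ("REDIRECTION", ("REDIRECTION_OP", "WORD")),
    11: ("REDIRECTION", ()),
    12: ("CMD", ("WORD",)),
    13: ("CMD_ARG", ("WORD", "CMD_ARG")),
    14: ("CMD_ARG", ("SINGLE_QUOTE", "WORD", "DOUBLE_QUOTE", "CMD_ARG")),
    15: ("CMD_ARG", ("DOUBLE_QUOTE", "WORD", "DOUBLE_QUOTE", "CMD_ARG")),
    16: ("CMD_ARG", ()),
    17: ("REDIRECTION_OP", ("HERE_DOC",)),
    18: ("REDIRECTION_OP", ("APPEND",)),
    19: ("REDIRECTION_OP", ("INFILE",)),
    20: ("REDIRECTION_OP", ("OUTFILE",)),
}
NONTERMS = {lhs for lhs, _ in RULES.values()}


def first_of(seq, first, nullable):
    """Return (FIRST set of a symbol sequence, whether it can derive ε)."""
    out = set()
    for sym in seq:
        if sym not in NONTERMS:      # terminal: the sequence starts here
            out.add(sym)
            return out, False
        out |= first[sym]
        if sym not in nullable:
            return out, False
    return out, True


def analyse():
    """Compute nullable, FIRST and FOLLOW by fixed-point iteration."""
    nullable = set()
    first = {n: set() for n in NONTERMS}
    follow = {n: set() for n in NONTERMS}
    follow["COMMAND_LINE"].add("$")  # end-of-input follows the start symbol
    changed = True
    while changed:
        changed = False
        for lhs, rhs in RULES.values():
            f, eps = first_of(rhs, first, nullable)
            if not f <= first[lhs]:
                first[lhs] |= f
                changed = True
            if eps and lhs not in nullable:
                nullable.add(lhs)
                changed = True
            for i, sym in enumerate(rhs):
                if sym in NONTERMS:
                    f, eps = first_of(rhs[i + 1:], first, nullable)
                    new = f | (follow[lhs] if eps else set())
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return nullable, first, follow


def ll1_table():
    """Map (non-terminal, lookahead) to the list of predicted rule numbers."""
    nullable, first, follow = analyse()
    table = {}
    for num, (lhs, rhs) in RULES.items():
        f, eps = first_of(rhs, first, nullable)
        for tok in f | (follow[lhs] if eps else set()):
            table.setdefault((lhs, tok), []).append(num)
    return table


def conflicts(table):
    """Keep only the cells that predict more than one rule."""
    return {cell: nums for cell, nums in table.items() if len(nums) > 1}


if __name__ == "__main__":
    for cell, nums in sorted(conflicts(ll1_table()).items()):
        print(cell, nums)
```

For this grammar the cell `(CMD_PREFIX, WORD)` comes out holding rules 5 and 6, which is the situation described above: with WORD as the only lookahead, both alternatives of CMD_PREFIX are predicted.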

(In fact, even with a parsing method other than LL(1), we couldn't decide.)
