Interpret everything after LeftBracket as a string until the next RightBracket
[Solution] at the bottom, under EDIT3
I am currently developing a new grammar (based on requirements that I cannot change), and the following requirement poses a problem that I cannot solve at the moment. I am using Antlr4 with the C# target.
The syntax is as follows:
print [blabla ]
Everything inside the brackets is considered a string. So this as well:
print [3 + 2]
will print
3 + 2
Now I have lexer rules which will obviously match the 3 as an Integer. So how can I create a parser rule which will parse anything until a ']' is found? I currently have the following production:
control
:
| Print expr
| Print LeftBracket printArg RightBracket
;
The problem I am facing is that the left bracket does not always start a string. Sometimes (e.g. in while) the condition is also in brackets. I thought about just accepting any lexer token until the RightBracket is reached and then building the string at runtime when I walk the generated parse tree, but that seems very cumbersome to me, and I would need to re-insert the whitespace afterwards, which will be difficult.
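On the whitespace concern: if each token records where it starts and stops in the input, the raw text between the brackets can be sliced out of the original source instead of being rebuilt token by token. The following is a minimal Python sketch of that idea, not ANTLR code. The toy `tokenize` function and the `Token` tuple are assumptions made for illustration; ANTLR tokens carry comparable start/stop character indexes.

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    text: str
    start: int  # index of the first character in the source
    stop: int   # index of the last character (inclusive)


def tokenize(source: str) -> List[Token]:
    """Toy lexer (hypothetical): integers, identifiers, single symbols.

    Whitespace is dropped, as a typical lexer would do with '-> skip'.
    """
    pattern = re.compile(r"\d+|[A-Za-z_]\w*|\S")
    return [Token(m.group(), m.start(), m.end() - 1)
            for m in pattern.finditer(source)]


def bracket_text(source: str, tokens: List[Token]) -> str:
    """Return the raw text between the first '[' and the next ']'.

    The text is sliced from the source using token offsets, so the
    whitespace that the lexer skipped is preserved.
    Raises StopIteration if either bracket is missing.
    """
    open_idx = next(i for i, t in enumerate(tokens) if t.text == "[")
    close_idx = next(i for i, t in enumerate(tokens)
                     if i > open_idx and t.text == "]")
    return source[tokens[open_idx].stop + 1:tokens[close_idx].start]


if __name__ == "__main__":
    src = "print [3 +   2]"
    print(repr(bracket_text(src, tokenize(src))))
```

This only addresses the whitespace part; the contents are still split into ordinary tokens first, so it would not help if some character inside the brackets is not a valid token in the grammar.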
If you need more parts of my grammar, just ask in a comment and I will provide more details.
Kind regards,
Lukas
EDIT: More information about my grammar. The following productions use brackets:
Print LeftBracket printArg RightBracket
Repeat IntegerConstant LeftBracket body RightBracket
While LeftBracket expr RightBracket LeftBracket body RightBracket
If expr LeftBracket body RightBracket LeftBracket body RightBracket
SetPos LeftBracket IntegerConstant IntegerConstant RightBracket
EDIT2: So I tried to use lexer modes, but I had problems returning from them. These are the lines I have regarding the modes:
mode printMode;
WhitespacePrint
: [ \t]+
-> skip
;
LeftBracketPrint : '[' -> popMode, pushMode(stringMode);
NotLeftBracket : ~'[' -> popMode;
mode stringMode;
String : ~']'+;
RightBracketPrint: ']' -> popMode;
And I added a pushMode(printMode) to the Print lexer rule (which matches the keyword). Now parsing print [ 1 + 2] creates a single token containing the whole string inside the brackets. But when I use print 1 + 2 (which should output 3), I get a "no viable alternative at input 'print1'" exception, since the '1' has the type NotLeftBracket. How can I switch the mode without consuming the input?
EDIT3: Next I tried some inline code with lookahead, which finally solved my problem:
mode printMode;
LeftBracketPrint : [ \t]+ '[' -> popMode, pushMode(stringMode);
WhitespacePrint
: [ \t]+ {_input.La(1) != '['}?
-> skip, popMode
;
mode stringMode;
String : ~']'+;
RightBracketPrint: ']' -> popMode;
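To make the control flow of the EDIT3 rules easier to follow, here is a rough Python simulation of a lexer with a mode stack. It is only a sketch, not ANTLR output: the default-mode token types (Integer, Identifier, Symbol) are assumed, and popping printMode without consuming anything when no whitespace follows the keyword is my own assumption about a reasonable fallback.

```python
import re

WS_THEN_BRACKET = re.compile(r"[ \t]+\[")
WS = re.compile(r"[ \t]+")
DEFAULT_RULES = [  # assumed default-mode rules, for illustration only
    ("Print", re.compile(r"print\b")),
    ("Integer", re.compile(r"\d+")),
    ("Identifier", re.compile(r"[A-Za-z_]\w*")),
    ("Symbol", re.compile(r"\S")),
]


def lex(source):
    """Return (type, text) pairs, switching modes like pushMode/popMode."""
    tokens, modes, pos = [], ["default"], 0
    while pos < len(source):
        mode = modes[-1]
        if mode == "printMode":
            m = WS_THEN_BRACKET.match(source, pos)
            if m:  # LeftBracketPrint -> popMode, pushMode(stringMode)
                tokens.append(("LeftBracketPrint", m.group()))
                modes[-1] = "stringMode"
                pos = m.end()
                continue
            m = WS.match(source, pos)
            if m:  # WhitespacePrint: next char is not '[' -> skip
                pos = m.end()
            modes.pop()  # popMode; the following input is left untouched
            continue
        if mode == "stringMode":
            if source[pos] == "]":  # RightBracketPrint -> popMode
                tokens.append(("RightBracketPrint", "]"))
                modes.pop()
                pos += 1
            else:  # String : ~']'+
                end = source.find("]", pos)
                end = len(source) if end == -1 else end
                tokens.append(("String", source[pos:end]))
                pos = end
            continue
        # default mode
        m = re.compile(r"\s+").match(source, pos)
        if m:
            pos = m.end()
            continue
        for name, rule in DEFAULT_RULES:
            m = rule.match(source, pos)
            if m:
                tokens.append((name, m.group()))
                if name == "Print":
                    modes.append("printMode")  # pushMode(printMode)
                pos = m.end()
                break
    return tokens


if __name__ == "__main__":
    print(lex("print [ 1 + 2]"))
    print(lex("print 1 + 2"))
```

The point of the lookahead is visible in the printMode branch: the decision is made by peeking at what follows the whitespace, so in the `print 1 + 2` case the `1` is still available to the default-mode rules.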
I would start by treating everything inside brackets as a BracketLiteral in the lexer.
LeftBracket : '[' -> pushMode(BracketLiteralMode);
mode BracketLiteralMode;
BracketLiteral : ~']'+;
RightBracket : ']' -> popMode;
Before determining how the special cases would be handled, I would then enumerate every last possibility for where an exception to the BracketLiteral rule could appear in the grammar. If you can add those details, I would be able to make some suggestions regarding how to handle those cases.
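As a rough illustration of why those exceptions matter, the Python sketch below mimics the three lexer rules above: every '[' switches to a literal mode that runs to the next ']'. The Word and Other token types for default-mode input are assumptions made for the example and are not part of the suggested grammar.

```python
import re

WORD = re.compile(r"\w+")


def lex_bracket_literals(source):
    """Return (type, text) pairs where every '[' starts a BracketLiteral."""
    tokens, in_literal, pos = [], False, 0
    while pos < len(source):
        ch = source[pos]
        if in_literal:
            if ch == "]":  # RightBracket -> popMode
                tokens.append(("RightBracket", "]"))
                in_literal = False
                pos += 1
            else:  # BracketLiteral : ~']'+
                end = source.find("]", pos)
                end = len(source) if end == -1 else end
                tokens.append(("BracketLiteral", source[pos:end]))
                pos = end
        elif ch == "[":  # LeftBracket -> pushMode(BracketLiteralMode)
            tokens.append(("LeftBracket", "["))
            in_literal = True
            pos += 1
        elif ch.isspace():
            pos += 1
        else:
            # Assumed default-mode handling: words, or single characters
            m = WORD.match(source, pos)
            text = m.group() if m else ch
            tokens.append(("Word" if m else "Other", text))
            pos += len(text)
    return tokens


if __name__ == "__main__":
    print(lex_bracket_literals("print [3 + 2]"))
    print(lex_bracket_literals("while [x] [print [hi]]"))
```

With the second input, the while condition and the body both come out as BracketLiteral tokens, and the nested `print [hi]` ends the body literal at its first `]`, leaving a stray `]` behind. Cases like these are the ones that would need to be listed and handled separately.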
If I understand correctly, there is a duality in how bracketed content is interpreted: it is either a string or an expression depending on the context (for print it is a string).
Two possible scenarios:
I think the first approach is easier, because in the second you have to write parsing rules for the content of print, and that content might not be parsable:
print [ a ++++ 2 ]