Interpret everything after LeftBracket as a string until the next RightBracket

[Solution] at the bottom, under EDIT3.

I am currently developing a new grammar (based on requirements that I cannot change), and the following requirement poses a problem I cannot solve at the moment. I am using ANTLR4 with the C# target.

The syntax is as follows:

print [blabla ]

So everything inside the brackets is considered a string. That means this:

print [3 + 2]

will print

3 + 2

Now, I have lexer rules that will obviously match the 3 as an Integer. So how can I create a parser rule that accepts anything until a ']' is found? I currently have the following production:

control
: 
| Print expr
| Print LeftBracket printArg RightBracket
    ;

The problem I am facing is that the left bracket does not always start a string. Sometimes (e.g. in while) the condition is also in brackets. I thought about just accepting every lexer token until the RightBracket is reached and then building the string at runtime when I walk the generated parse tree, but that seems very cumbersome, and I would need to re-insert the whitespace afterwards, which would be difficult.
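On the whitespace concern: if each token remembers where it started and stopped in the input (ANTLR tokens do record start and stop positions), the bracket content can be sliced out of the original source instead of being reassembled token by token. The following is only a minimal Python sketch of that idea, not ANTLR code; the `Token` type, the regular expression, and the helper names are hypothetical stand-ins for the real lexer.

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str
    start: int  # offset of the first character in the source
    stop: int   # offset one past the last character


# Hypothetical token set, loosely modelled on the grammar in the question.
TOKEN_RE = re.compile(
    r"(?P<Print>print\b)|(?P<LeftBracket>\[)|(?P<RightBracket>\])"
    r"|(?P<Integer>\d+)|(?P<Other>[^\s\[\]]+)"
)


def tokenize(src: str) -> List[Token]:
    """Split src into tokens; whitespace is skipped but offsets are kept."""
    return [Token(m.lastgroup, m.group(), m.start(), m.end())
            for m in TOKEN_RE.finditer(src)]


def bracket_text(src: str, tokens: List[Token]) -> str:
    """Return the raw source text between the first '[' and the next ']'."""
    left = next(t for t in tokens if t.kind == "LeftBracket")
    right = next(t for t in tokens
                 if t.kind == "RightBracket" and t.start > left.start)
    # Slice the original input, so skipped whitespace comes back for free.
    return src[left.stop:right.start]


source = "print [3 + 2]"
print(bracket_text(source, tokenize(source)))
```

The drawback mentioned above still applies: the parser has to accept an arbitrary token sequence between the brackets, so this only removes the whitespace problem, not the grammar problem.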

If you need more parts of my grammar, just ask in a comment and I will provide more details.

Kind regards,
Lukas

EDIT: More information about my grammar. The following productions use brackets:

Print LeftBracket printArg RightBracket
Repeat IntegerConstant LeftBracket body RightBracket
While LeftBracket expr RightBracket LeftBracket body RightBracket
If expr LeftBracket body RightBracket LeftBracket body RightBracket
SetPos LeftBracket IntegerConstant IntegerConstant RightBracket

EDIT2: So I tried to use lexer modes, but I ran into problems returning from them. These are the lines I have for the modes:

mode printMode;
WhitespacePrint
    :   [ \t]+
        -> skip
    ;
LeftBracketPrint : '[' -> popMode, pushMode(stringMode);
NotLeftBracket : ~'[' -> popMode;

mode stringMode;
String : ~']'+;
RightBracketPrint: ']' -> popMode;

And I added a pushMode(printMode) to the Print lexer rule (which matches the keyword). Now parsing print [ 1 + 2] creates a single token containing the whole string inside the brackets. But when I use print 1 + 2 (which should output 3), I get a "no viable alternative at input 'print1'" exception, since the '1' has the type NotLeftBracket. How can I switch the mode without consuming the input?

EDIT3: Next I tried some inline code with lookahead, which finally solved my problem:

mode printMode;
LeftBracketPrint : [ \t]+ '[' -> popMode, pushMode(stringMode);
WhitespacePrint
    :   [ \t]+ {_input.La(1) != '['}?
        -> skip, popMode
    ;

mode stringMode;
String : ~']'+;
RightBracketPrint: ']' -> popMode;
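The key point of the predicate is that it only peeks at the next character and never consumes it, so the '1' in print 1 + 2 is left for the default mode to tokenize. A minimal Python sketch of the same decision, assuming a hypothetical helper `mode_after_print` (unlike the grammar above, it also tolerates zero whitespace before the bracket):

```python
def mode_after_print(src: str, pos: int) -> str:
    """Pick the lexer mode to use after a print keyword that ends at src[pos].

    The function only looks ahead; the caller's position is never advanced,
    so whatever follows is still available to the chosen mode.
    """
    i = pos
    # Look past spaces and tabs, like the [ \t]+ part of the lexer rules.
    while i < len(src) and src[i] in " \t":
        i += 1
    # Same test as the predicate: is the next character a '['?
    if i < len(src) and src[i] == "[":
        return "stringMode"
    return "defaultMode"


print(mode_after_print("print [1 + 2]", 5))
print(mode_after_print("print 1 + 2", 5))
```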

I would start by treating everything inside brackets as a BracketLiteral in the lexer.

LeftBracket : '[' -> pushMode(BracketLiteralMode);

mode BracketLiteralMode;

  BracketLiteral : ~']'+;
  RightBracket : ']' -> popMode;

Before determining how the special cases would be handled, I would then enumerate every last possibility for where an exception to the BracketLiteral rule could appear in the grammar. If you can add those details, I would be able to make some suggestions regarding how to handle those cases.

If I understand correctly, there is a duality in interpreting bracketed content: it is either a string or an expression, depending on the context (for print it is a string).

Two possible scenarios:

  • At the lexer level, check the context when hitting the left bracket, and then go into either string mode or regular mode (i.e. expression).
  • Also at the lexer level, create a buffer whenever you hit a left bracket and fill it with the text that follows; use the right bracket's value (normally it is useless) as the vehicle to pass the verbatim string.

I think the first approach is easier, because in the second you have to write parsing rules for the content of print, and that content might not be parsable:

print [ a ++++ 2 ]
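The first scenario can be sketched as a tiny hand-written lexer that remembers the previous token and treats a '[' as the start of a verbatim string only when it directly follows print. This is a rough Python illustration under assumptions, not ANTLR code: the lower-case keyword spellings, the `lex` function, and the "Other" token kind are all hypothetical.

```python
from typing import List, Optional, Tuple

# Keyword spellings are assumed; the token names come from the question.
KEYWORDS = {"print": "Print", "while": "While", "repeat": "Repeat",
            "if": "If", "setpos": "SetPos"}


def lex(src: str) -> List[Tuple[str, str]]:
    """Tokenize src; a '[' right after Print starts a verbatim string."""
    tokens: List[Tuple[str, str]] = []
    prev_kind: Optional[str] = None
    i = 0
    while i < len(src):
        ch = src[i]
        if ch.isspace():
            i += 1
            continue
        if ch == "[" and prev_kind == "Print":
            # String context: take everything up to the next ']' verbatim.
            end = src.find("]", i + 1)
            if end == -1:
                raise ValueError("unterminated '[' after print")
            tokens += [("LeftBracket", "["), ("String", src[i + 1:end]),
                       ("RightBracket", "]")]
            i, prev_kind = end + 1, "RightBracket"
            continue
        if ch in "[]":
            # Regular context: brackets are ordinary tokens.
            kind = "LeftBracket" if ch == "[" else "RightBracket"
            tokens.append((kind, ch))
            i, prev_kind = i + 1, kind
            continue
        # Otherwise read a run of characters up to whitespace or a bracket.
        j = i
        while j < len(src) and not src[j].isspace() and src[j] not in "[]":
            j += 1
        text = src[i:j]
        kind = KEYWORDS.get(text, "Integer" if text.isdigit() else "Other")
        tokens.append((kind, text))
        i, prev_kind = j, kind
    return tokens


for token in lex("print [ a ++++ 2 ]"):
    print(token)
```

Because the bracket content never reaches the expression rules, input such as a ++++ 2 is passed through as one String token instead of causing a parse error.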
