
ANTLR4: ignore white spaces in the input but not those in string literals

I have a simple grammar as follows:

grammar SampleConfig;

line: ID (WS)* '=' (WS)* string;

ID: [a-zA-Z]+;
string: '"' (ESC|.)*? '"' ;
ESC : '\\"' | '\\\\' ; // 2-char sequences \" and \\
WS: [ \t]+ -> skip;

The spaces in the input are completely ignored, including those in the string literal.

final String input = "key = \"value with spaces in between\"";
final SampleConfigLexer l = new SampleConfigLexer(new ANTLRInputStream(input));
final SampleConfigParser p = new SampleConfigParser(new CommonTokenStream(l));
final LineContext context = p.line();
System.out.println(context.getChildCount() + ": " + context.getText());

This prints the following output:

3: key="valuewithspacesinbetween"

But I expected the white spaces in the string literal to be retained, i.e.

3: key="value with spaces in between"

Is it possible to correct the grammar to achieve this behavior, or should I just override CommonTokenStream to ignore whitespace during the parsing process?

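You shouldn't expect any spaces in parser rules, since you're skipping them in your lexer. Because string is a parser rule here, the text between the quotes is tokenized like the rest of the input, and the WS tokens are discarded before the parser ever sees them.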

Either remove the skip command or make string a lexer rule:

STRING : '"' ( '\\' [\\"] | ~[\\"\r\n] )* '"';
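
For context, here is a sketch of how the whole grammar might look with that lexer rule in place. It assumes the rest of the question's grammar stays as it was. Dropping the (WS)* references from line and removing the now-unused ESC rule are my own tidying, not part of the answer above.

```antlr
grammar SampleConfig;

// Parser rule. WS is skipped by the lexer, so it never reaches the parser
// and does not need to be mentioned here.
line   : ID '=' STRING ;

// Lexer rules.
ID     : [a-zA-Z]+ ;

// The whole quoted literal is matched as a single token, so the spaces
// inside it are part of the token text and are not skipped.
// Escapes \" and \\ are allowed; raw line breaks are not.
STRING : '"' ( '\\' [\\"] | ~[\\"\r\n] )* '"' ;

// Whitespace between tokens is discarded.
WS     : [ \t]+ -> skip ;
```

With a grammar along these lines, the driver code from the question should see three children under line (ID, '=' and STRING), and getText() should keep the spaces inside the quotes.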
