
ANTLR struggling to parse integer vs quoted string

I'm trying to create a language using ANTLR in which each line consists of one instruction, and an instruction is an opcode followed by any number of operands, like so:

aaa "str1" "str2" 123
bbb 123 "str" 456
ccc
ddd

Strings seem to work OK, but integers seem to be parsed incorrectly.

Here's my complete grammar file:

grammar Insn;

prog: (line? NEWLINE)+;

line: instruction;
instruction: instruction_name instruction_operands?;

instruction_name: IDENTIFIER;
instruction_operands: instruction_operand instruction_operand*;
instruction_operand: ' '+ (operand_int | operand_string);

operand_int: INT;
operand_string: QSTRING;

NEWLINE : [\r\n]+;
IDENTIFIER: [a-zA-Z0-9_\-]+;
INT: '-'?[0-9]+;
QSTRING: '"' (~('"' | '\\' | '\r' | '\n') | '\\' ('"' | '\\'))* '"';
COMMENT: ';' ~[\r\n]* -> channel(HIDDEN);

I've tried multiple different INT definitions, such as INT: '-'?('0'..'9')+; and INT: '2'; (making all the INTs in the input 2). They always result in an error similar to line 1:18 extraneous input '123' expecting {' ', INT, QSTRING}, with the line number, column and the integer 123 replaced by whatever it was parsing.

Here's the parse tree generated by ANTLR's tooling, as used in the ANTLR getting-started.md document. [image: parse tree]

I'm completely new to ANTLR and am not familiar with a lot of the terminology, so please keep it simple for me.

The problem is that 123 is recognised as IDENTIFIER because it is a valid identifier (all INTs are). When two lexer rules match the same text with the same length, ANTLR picks the rule defined first in the grammar, and IDENTIFIER comes before INT, so INT is never produced. The two rules must be distinguishable. IDENTIFIER should probably be something like this: IDENTIFIER: [a-zA-Z][a-zA-Z0-9_\-]*;
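To see the effect outside of ANTLR, here is a rough Python sketch that imitates how the lexer chooses a token: the longest match wins, and on a tie the rule listed first wins. It is not ANTLR itself. The tokenize helper, the rule lists and the SPACE rule (standing in for the ' ' literal used in the parser rule) are assumptions made for illustration, and NEWLINE and COMMENT are left out because the sample is a single line.

```python
import re

# Hypothetical helper, not part of ANTLR: it only imitates the two
# tie-breaking rules of the lexer (longest match, then first rule listed).
def tokenize(text, rules):
    compiled = [(name, re.compile(pattern)) for name, pattern in rules]
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, regex in compiled:
            m = regex.match(text, pos)
            # Strictly greater: on equal length the earlier rule is kept.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

# Regex translation of the QSTRING rule from the grammar.
QSTRING = r'"(?:[^"\\\r\n]|\\["\\])*"'

# Rules in the same order as the original grammar.
ORIGINAL_RULES = [
    ("SPACE", r" "),  # assumption: stands in for the ' ' literal
    ("IDENTIFIER", r"[a-zA-Z0-9_\-]+"),
    ("INT", r"-?[0-9]+"),
    ("QSTRING", QSTRING),
]

# Same rules, but IDENTIFIER must now start with a letter.
FIXED_RULES = [
    ("SPACE", r" "),
    ("IDENTIFIER", r"[a-zA-Z][a-zA-Z0-9_\-]*"),
    ("INT", r"-?[0-9]+"),
    ("QSTRING", QSTRING),
]

line = 'bbb 123 "str" 456'
original_tokens = tokenize(line, ORIGINAL_RULES)
fixed_tokens = tokenize(line, FIXED_RULES)
print(original_tokens)
print(fixed_tokens)
```

With the original rule order, 123 and 456 come out tagged as IDENTIFIER, which is why the parser complains that it expected an INT or QSTRING. With the letter-first IDENTIFIER they come out as INT, while bbb is still an IDENTIFIER and "str" is still a QSTRING.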
