ANTLR4: How can I create a regular expression that allows all characters except two selected ones?

Hi, for example, I have this code in my g4 file:

a: [A-Z][A-Z];
b: [a-z]'3';

Now I want to add one more line that recognizes all characters that do not belong to a or b.

I tried:

a: [A-Z][A-Z];
b: [a-z]'3';
ALLOTHERCHARACTERS: ~[a]|~[b]

But it didn't work.

For example, the input 84209ddjio29 should now be matched by ALLOTHERCHARACTERS, but it wasn't.

(The lexer ends up generating a Java file, but I don't think that matters for this task.)

There are several things going wrong here. First, you cannot use character sets inside parser rules, so a: [A-Z][A-Z]; is not possible. Only a lexer rule (whose name starts with an upper-case letter) can use character sets, so A: [A-Z][A-Z]; is valid.

So, to define a valid (lexer) grammar, you'd need to do this:

A : [A-Z] [A-Z];
B : [a-z] '3';

Now for your second problem: how to negate rules A and B? Answer: you cannot. You can only negate single characters. So the negation of A: [A-Z]; would be NA: ~[A-Z]; (NA: ~A; is also valid). But you cannot negate a rule that matches two characters, like A: [A-Z] [A-Z];.

If you want a rule that matches anything other than upper-case letters, lower-case letters and the digit 3, then you can do this:

ALLOTHERCHARACTERS : ~[A-Za-z3];
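
To see how these rules fit together, here is a minimal sketch of a complete lexer grammar. The grammar name Demo (so the file would be Demo.g4) and the extra catch-all rule OTHER are assumptions added for illustration; they are not part of the answer above. OTHER is placed last on purpose: ANTLR's lexer picks the rule that matches the most input and, on a tie, the rule defined first, so A and B still win wherever they apply.

```antlr
// Demo.g4 (hypothetical file name; it must match the grammar name)
lexer grammar Demo;

// Two upper-case letters, e.g. "AB"
A : [A-Z] [A-Z] ;

// One lower-case letter followed by the digit 3, e.g. "x3"
B : [a-z] '3' ;

// Any single character that is not a letter and not the digit 3
ALLOTHERCHARACTERS : ~[A-Za-z3] ;

// Illustrative catch-all: any single character no earlier rule claimed,
// such as a lone letter or a lone '3'
OTHER : . ;
```

With this sketch, the digits in 84209ddjio29 should come out as ALLOTHERCHARACTERS tokens, while the letters d, d, j, i, o (none of which is followed by 3) would fall through to OTHER instead of producing token recognition errors.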

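In ordinary regular-expression syntax, "anything except" is written like this:

[^ab]

That matches any single character that is not a or b. Note that this is regular-expression notation rather than ANTLR4 grammar syntax, where negation is written with the ~ operator.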
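
For comparison, a sketch of how the same idea would look as an ANTLR4 lexer rule. The rule name NOT_A_OR_B is hypothetical; ANTLR character sets do not use ^ for negation, so the ~ operator goes in front of the set instead.

```antlr
// Matches any single character other than 'a' or 'b'
NOT_A_OR_B : ~[ab] ;
```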


 