
ANTLR4 parser reports errors in code even though the IntelliJ plugin parses the input correctly

I am attempting to construct a compiler for the golfing language Vyxal and am using ANTLR for parsing. I have written down almost the entire language, and the grammar works perfectly in the IntelliJ plugin for ANTLR. But when I run my own code on the input ⟨1|2⟩⟨1|2⟩, it reports line 1:10 mismatched input '<EOF>' expecting <a bunch of random characters>. I found this question, but my lexer and parser rules are in the same file, and I have already confirmed that the tokens are correct.

My grammar:

grammar Vyxal;

@header {
package io.github.seggan.jyxal.antlr;
}

file
    : program EOF
    ;

program
    : (literal | structure | element)+
    ;

element
    : PREFIX? element_type
    ;

element_type
    : ALPHA | '<' | ':' | '\u00d7' | '\u1e40' | '\u1e6b' | '\u2087' | '\u00be' | '\u2084' | '\u21b5'
    | '\u00b9' | '\u03a0' | '\u00e6' | '\u1e61' | '\u2211' | '\u1e8e'    | '\u221a' | '\u1e0b' | '\u00a7'
    | '\u00b2' | '\u2026' | '\u1e45' | '\u017b' | '\u01cd' | '-' | '\u2235' | '\u2194' | '\u2260' | '\u027e'
    | '\u00a4' | '\u20b4' | '\u01cf'    | '\u21e7' | '\u0121' | '\u1e8f' | '\u207c' | '\u204b' | '\u2229'
    | '\u2248' | '\u2237' | '\u2088' | '\u00f7' | '\u0227' | '\u0280' | '\u2080' | '\u1e02' | '\u228d'
    | '\u2234'    | '\u2228' | '\u022f' | '\u2070' | '\u1e8a' | '\u21e9' | '\u1e87' | '\u2039' | '\u1e2d'
    | '\u2020' | '\u201f' | '\u2308' | '\u2081' | '!' | '\u20ac' | '\u0188' | '\u01d2'    | '\u027d' | '\u0281'
    | ',' | '\u022e' | '\u22ce' | '\u03c4' | '\u01ce' | '\u1e59' | '%' | '\u1e86' | '\u2227' | '\u21b2'
    | '\u01d0' | '\u00a2' | '\u201e' | '\u0116'    | '\u2082' | '\u1e1e' | '\ua60d' | '}' | '*' | '\u1e8b'
    | '?' | '\u2085' | '\u0140' | '\u00df' | '\u27c7' | '\u2105' | '\u00a5'| '\u2086' | '\u0120' | '\u1e57'
    | '\u221e' | '\u1e56' | '\ua71d' | '\u01d3' | '\u203a' | '\u03b5' | '\u25a1' | '\u1e6a' | '\u00a6'
    | '\u0117' | '$' | '\u1e58' | '\u0130' | '=' | '\u2193' | '\u010b'    | '\u2083' | '\u1e22' | '_' | '\u27d1'
    | '\u010a' | '\u013f' | '\u00ac' | '\u00b6' | '\u00f0' | '\u1e1f' | '\u00a1' | '\u00af' | '\u2265'
    | '\u01d4' | '\u017c' | '\u2191'    | '\u1e0a' | '\u00bc' | '\u22cf' | '\u01d1' | '>' | '\u1e41' | '\u00a3'
    | '\u215b' | '\u1e23' | '+' | '\u00b1' | '/' | '\u21b3' | '\u222a' | '\u2207' | '\u2264'    | '\u1e03'
    | '\u2310' | '^' | '\u1e60' | '\u0226' | '\u03b2' | '\u2022' | '\u00bd' | '\u1e44'
    ;

// structures
structure
    : if_statement
    | for_loop
    | while_loop
    | lambda
    | function
    | variable_assn
    ;

if_statement
    : '[' program ('|' program)? ']'?
    ;

for_loop
    : '(' (variable '|')? program ')'?
    ;

while_loop
    : '{' (program '|')? program '}'?
    ;

lambda
    : LAMBDA_TYPE (integer '|')? program ';'?
    ;

function
    : '@' variable ((':' parameter (':' parameter)*)? '|' program)? ';'?
    ;

variable_assn
    : ASSN_SIGN variable
    ;

variable
    : (ALPHA | DIGIT)+
    ;

parameter
    : '*' | variable | integer
    ;


// types
literal
    : number
    | string
    | list
    ;

string
    : normal_string
    | compressed_string
    | single_char_string
    | double_char_string
    ;

number
    : integer
    | complex
    | compressed_number
    ;

integer
    : DIGIT+ ('.' DIGIT+)?
    ;

complex
    : integer '°' integer
    ;

list
    : '\u27e8' program ('|' program)* '\u27e9'?
    ;

any_text
    : .+?
    ;

compressed_string
    : '\u00ab' any_text '\u00ab'?
    ;

normal_string
    : '`' any_text '`'?
    ;

single_char_string
    : '\\' .
    ;

double_char_string
    : '‛' . .
    ;

compressed_number
    : '\u00bb' any_text '\u00bb'?
    ;

DIGIT
    : [0-9]
    ;


// code
PREFIX
    : [¨Þkø∆]
    ;

ALPHA
    : [a-zA-Z]
    ;

ASSN_SIGN
    : '→' | '←'
    ;

// structures
LAMBDA_TYPE
    : [λƛ'µ]
    ;

WHT
    : [ \t\n\r] -> skip
    ;

The code I use is simple:

String s = Files.readString(Path.of(args[0]));
VyxalLexer lexer = new VyxalLexer(CharStreams.fromString(s));
VyxalParser parser = new VyxalParser(new CommonTokenStream(lexer));

The IntelliJ plugin constructs it properly: [image: ANTLR IntelliJ plugin output]

And the tokens are of the correct values:

[@-1,0:0='⟨',<163>,1:0]
[@-1,1:1='1',<170>,1:1]
[@-1,2:2='|',<154>,1:2]
[@-1,3:3='2',<170>,1:3]
[@-1,4:4='⟩',<164>,1:4]
[@-1,5:5='⟨',<163>,1:5]
[@-1,6:6='1',<170>,1:6]
[@-1,7:7='|',<154>,1:7]
[@-1,8:8='2',<170>,1:8]
[@-1,9:9='⟩',<164>,1:9]

IntelliJ displays this:

[image]

And when I run:

VyxalLexer lexer = new VyxalLexer(CharStreams.fromString("⟨1|2⟩⟨1|2⟩"));
VyxalParser parser = new VyxalParser(new CommonTokenStream(lexer));
System.out.println(parser.file().toStringTree(parser));

the following is printed:

(file (program (literal (list ⟨ (program (literal (number (integer 1)))) | (program (literal (number (integer 2)))) ⟩)) (literal (list ⟨ (program (literal (number (integer 1)))) | (program (literal (number (integer 2)))) ⟩))) <EOF>)

Which is in sync with what IntelliJ displays.

I'm guessing you haven't regenerated the parser classes recently, so your own Java code is using older parser classes than the ones the IntelliJ plugin is using.
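
One rough way to test that guess is to compare file timestamps: if the generated parser source is older than the grammar, the ANTLR tool has not been re-run since the last grammar change. The sketch below uses only the Java standard library. The class name `StaleParserCheck` and both default paths are hypothetical and not taken from the question, so adjust them to your project layout.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: reports whether generated parser sources lag behind the grammar.
final class StaleParserCheck {

    /**
     * Returns true when the generated file is missing or was last modified
     * before the grammar file, i.e. when the parser should be regenerated.
     */
    static boolean isStale(Path grammar, Path generated) {
        try {
            if (!Files.exists(generated)) {
                return true; // nothing has been generated yet
            }
            // A negative comparison means the generated file is older than the grammar.
            return Files.getLastModifiedTime(generated)
                    .compareTo(Files.getLastModifiedTime(grammar)) < 0;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed default locations; pass the real paths as arguments instead.
        Path grammar = Path.of(args.length > 0 ? args[0] : "src/main/antlr/Vyxal.g4");
        Path generated = Path.of(args.length > 1 ? args[1] : "build/generated/VyxalParser.java");

        if (!Files.exists(grammar)) {
            System.out.println("Grammar not found: " + grammar);
            return;
        }
        System.out.println(isStale(grammar, generated)
                ? "Generated parser is missing or older than the grammar; regenerate it."
                : "Generated parser is at least as new as the grammar.");
    }
}
```

A timestamp check only catches the simple case. If the build copies or touches generated files, regenerating the parser and rebuilding from a clean state is the more reliable test.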
