
GDL Antlr grammar

I need a parser for the Game Description Language (GDL) in Java.

For this I am currently trying to use ANTLR4.

My current grammar, given below, does not seem to be correct; at least, the generated parser does not recognize the game description that I also provide below.

The ANTLR4 grammar:

grammar GDL;

description :  (gdlRule | sentence)+ ;

gdlRule : '(' SP? '<=' SP? sentence (SP literal)* SP? ')';

sentence : propLit | ( '(' relLit ')' );

literal : ( '(' SP? (orLit | notLit | distinctLit | relLit) SP? ')' ) 
| ( '('  (orLit | notLit | distinctLit | relLit) ')' ) 
| propLit;
notLit : 'not' SP literal | '~' literal;
orLit : 'or' SP literal* ;
distinctLit : 'distinct' SP term SP term;
propLit : constant;
relLit : constant (SP term)+;

term : ( '(' funcTerm ')' ) | varTerm | constTerm;
funcTerm : constant (SP term)*;
varTerm : '?' constant;
constTerm : constant;


constant : ident | number;
/* ident is any string of letters, digits, and underscores */
ident: ID;
number: NR;
NR : [0-9]+;
ID : [a-zA-Z] [a-zA-Z0-9]* ;
SP : ' '+;

COMMENT : ';'[A-Za-z0-9; \r\t]*'\n' -> skip;
WS : [ ;\t\r\n]+ -> skip
;

The game description given in GDL:

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Tictictoe
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

  (role white)
  (role black)

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

  (init (cell 1 1 b))
  (init (cell 1 2 b))
  (init (cell 1 3 b))
  (init (cell 2 1 b))
  (init (cell 2 2 b))
  (init (cell 2 3 b))
  (init (cell 3 1 b))
  (init (cell 3 2 b))
  (init (cell 3 3 b))
  (init (step 1))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

  (<= (next (cell ?j ?k x))
      (true (cell ?j ?k b))
      (does white (mark ?j ?k))
      (does black (mark ?m ?n))
      (or (distinct ?j ?m) (distinct ?k ?n)))

  (<= (next (cell ?m ?n o))
      (true (cell ?m ?n b))
      (does white (mark ?j ?k))
      (does black (mark ?m ?n))
      (or (distinct ?j ?m) (distinct ?k ?n)))

  (<= (next (cell ?m ?n b))
      (true (cell ?m ?n b))
      (does white (mark ?m ?n))
      (does black (mark ?m ?n)))

  (<= (next (cell ?p ?q b))
      (true (cell ?p ?q b))
      (does white (mark ?j ?k))
      (does black (mark ?m ?n))
      (or (distinct ?j ?p) (distinct ?k ?q))
      (or (distinct ?m ?p) (distinct ?n ?q)))

  (<= (next (cell ?m ?n ?w))
      (true (cell ?m ?n ?w))
      (distinct ?w b))


  (<= (next (step ?y))
      (true (step ?x))
      (succ ?x ?y))


  (succ 1 2)
  (succ 2 3)
  (succ 3 4)
  (succ 4 5)
  (succ 5 6)
  (succ 6 7)


  (<= (row ?m ?x)
      (true (cell ?m 1 ?x))
      (true (cell ?m 2 ?x))
      (true (cell ?m 3 ?x)))

  (<= (column ?n ?x)
      (true (cell 1 ?n ?x))
      (true (cell 2 ?n ?x))
      (true (cell 3 ?n ?x)))

  (<= (diagonal ?x)
      (true (cell 1 1 ?x))
      (true (cell 2 2 ?x))
      (true (cell 3 3 ?x)))

  (<= (diagonal ?x)
      (true (cell 1 3 ?x))
      (true (cell 2 2 ?x))
      (true (cell 3 1 ?x)))

  (<= (line ?x) (row ?m ?x))
  (<= (line ?x) (column ?m ?x))
  (<= (line ?x) (diagonal ?x))


  (<= nolinex
      (not (line x)))
  (<= nolineo
      (not (line o)))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

  (<= (legal white (mark ?x ?y))
      (true (cell ?x ?y b)))

  (<= (legal black (mark ?x ?y))
      (true (cell ?x ?y b)))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

  (<= (goal white 50)
      (line x)
      (line o))

  (<= (goal white 100)
      (line x)
      nolineo)

  (<= (goal white 0)
      nolinex
      (line o))

  (<= (goal white 50)
      nolinex
      nolineo)

  (<= (goal black 50)
      (line x)
      (line o))

  (<= (goal black 100)
      nolinex
      (line o))

  (<= (goal black 0)
      (line x)
      nolineo)

  (<= (goal black 50)
      nolinex
      nolineo)

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

  (<= terminal
      (true (step 7)))

  (<= terminal
      (line x))

  (<= terminal
      (line o))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

The error output of the generated parser:

line 24:6 mismatched input '(' expecting {')', SP}
line 27:7 no viable alternative at input '(or'

I don't know what I have to change or how to get a correct grammar.

Any help would be appreciated.

The problem is your handling of whitespace.

You have two rules, one of which creates a token:

SP : ' '+;

and the other of which simply ignores the whitespace:

WS : [ ;\t\r\n]+ -> skip

ANTLR's lexer picks the rule that matches the longest text and, on a tie, the rule listed first. If a run of whitespace consists only of space characters, both rules match the same text, SP is listed first, and you get an SP token. If the run starts with a newline, or contains any other character listed in the WS rule, WS matches more than SP can, and the entire run of whitespace is ignored.

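Since your grammar insists on SP tokens at certain points, the ignored whitespace causes a syntax error. On line 24 of your input, for example, the newline and indentation before (true ...) are skipped as WS, so the parser sees '(' where it expects SP or ')'.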
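To make the interaction concrete, here is a minimal Java sketch that imitates the two rules with `java.util.regex`. It is only an approximation of the lexer: the class name `WhitespaceRuleDemo` and its methods are made up for illustration, and the "longest match wins, ties go to the rule listed first" behavior is coded by hand rather than taken from ANTLR itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WhitespaceRuleDemo {
    // Regex stand-ins for the two lexer rules in the question's grammar.
    private static final Pattern SP = Pattern.compile(" +");
    private static final Pattern WS = Pattern.compile("[ ;\\t\\r\\n]+");

    // Length of the match of p at the very start of input, or 0 if there is none.
    static int prefixLength(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.lookingAt() ? m.end() : 0;
    }

    // Hand-coded rule selection: the longest match wins; on a tie, SP wins
    // because it is listed before WS in the grammar.
    static String winningRule(String input) {
        int sp = prefixLength(SP, input);
        int ws = prefixLength(WS, input);
        if (sp == 0 && ws == 0) {
            return "none";
        }
        return sp >= ws ? "SP" : "WS";
    }

    public static void main(String[] args) {
        // A single space between two symbols becomes an SP token.
        System.out.println(winningRule(" white)"));
        // A newline plus indentation is skipped as WS, so no SP reaches the parser.
        System.out.println(winningRule("\n      (true"));
        // Spaces followed by a newline: WS matches more text than SP.
        System.out.println(winningRule(" \n  ("));
    }
}
```

The second and third inputs are the shape of whitespace that separates the literals of a multi-line rule in the game description, which is where the grammar still demands SP.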

There is no reason that I can see to complicate your grammar with explicit whitespace. I would get rid of SP, remove all references to it in your grammar, and just let WS ignore whitespace.

I would also remove the semicolon from WS to avoid interactions with COMMENT. [Note 1] And I would simplify COMMENT so that it just ignores everything from a semicolon to the end of the line, rather than giving a list of valid comment characters. (What if you want to put a comma or a * in a comment?)
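As a rough illustration of the effect of those two changes, the Java sketch below imitates a WS rule without the semicolon and a COMMENT rule that runs from a semicolon to the end of the line. The regexes, the class name `SimplifiedSkipRulesDemo` and the method `skipLeading` are assumptions made for this example; they are not the ANTLR rules themselves.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SimplifiedSkipRulesDemo {
    // Assumed stand-in for a simplified COMMENT: a semicolon up to (not including) the line end.
    private static final Pattern COMMENT = Pattern.compile(";[^\\r\\n]*");
    // Assumed stand-in for WS with the semicolon removed.
    private static final Pattern WS = Pattern.compile("[ \\t\\r\\n]+");

    // Skips comments and whitespace at the start of input and returns what is left.
    static String skipLeading(String input) {
        int pos = 0;
        boolean advanced = true;
        while (advanced && pos < input.length()) {
            advanced = false;
            for (Pattern rule : new Pattern[] {COMMENT, WS}) {
                Matcher m = rule.matcher(input).region(pos, input.length());
                if (m.lookingAt() && m.end() > pos) {
                    pos = m.end();
                    advanced = true;
                }
            }
        }
        return input.substring(pos);
    }

    public static void main(String[] args) {
        // The comment contains a comma, a period and asterisks, and follows a leading newline.
        String source = "\n;;; Tic-tac-toe, v1.0 *draft*\n  (role white)";
        System.out.println(skipLeading(source)); // prints "(role white)"
    }
}
```

Because the two patterns can no longer start on the same character, it does not matter which of them is tried first, and a comment that ends at the end of the file is skipped as well.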


Notes

  1. You would see this problem if there were a newline at the beginning of the file, with the row of semicolons on line 2. Then COMMENT does not match at the first character, but WS does. WS will then match (and ignore) the newline, the row of semicolons, the next newline, the semicolons at the beginning of the next line, and the following space, leaving Tictictoe to be scanned as an ID, which will cause a parse error.

    You would also see it if any other comment were something other than a row of semicolons. These are currently being scanned as WS, starting with the newline before the comment. That happens to be OK, since the comment only includes semicolons. But any other non-whitespace character would terminate the WS and then be unexpectedly parsed as program text.
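The scenario in Note 1 can be imitated with a small Java sketch. The regexes approximate the COMMENT and WS rules from the question, and the class name `CommentSwallowDemo` and the method `afterFirstSkip` are invented for this example; for these particular patterns a greedy regex match is also the longest match, which is what the lexer would choose.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommentSwallowDemo {
    // Regex stand-ins for the question's lexer rules (WS still contains ';').
    private static final Pattern COMMENT = Pattern.compile(";[A-Za-z0-9; \\r\\t]*\\n");
    private static final Pattern WS = Pattern.compile("[ ;\\t\\r\\n]+");

    // Length of the match of p at the very start of input, or 0 if there is none.
    static int prefixLength(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.lookingAt() ? m.end() : 0;
    }

    // Returns the text left over after the longer of COMMENT and WS is skipped once.
    static String afterFirstSkip(String input) {
        int skipped = Math.max(prefixLength(COMMENT, input), prefixLength(WS, input));
        return input.substring(skipped);
    }

    public static void main(String[] args) {
        // Comment at the very start: COMMENT matches the whole line and hides the title.
        System.out.println(afterFirstSkip(";;; Tictictoe\n(role white)")); // prints "(role white)"
        // Leading newline: only WS can match, and it stops right before the title,
        // so the next thing the lexer sees begins with "Tictictoe".
        System.out.println(afterFirstSkip("\n;;;;\n;;; Tictictoe\n").startsWith("Tictictoe")); // prints "true"
    }
}
```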

(At least) 3 things are incorrect:

  • you include ; in your WS rule, and it is also the start of your COMMENT rule
  • your COMMENT rule requires a comment to end with a line break. However, line breaks are already handled by the WS rule, and the requirement disallows a comment that ends at EOF (without a line break)
  • SP is not needed: spaces should be skipped by the lexer and not appear in your parser rules

Try something like this instead:

grammar GDL;

description :  (gdlRule | sentence)+ ;

gdlRule : '(' '<=' sentence literal* ')';

sentence : propLit | ( '(' relLit ')' );

literal 
 : '(' (orLit | notLit | distinctLit | relLit) ')'
 | propLit
 ;

notLit : 'not' literal | '~' literal;
orLit : 'or' literal* ;
distinctLit : 'distinct' term term;
propLit : constant;
relLit : constant (term)+;
term : ( '(' funcTerm ')' ) | varTerm | constTerm;
funcTerm : constant (term)*;
varTerm : '?' constant;
constTerm : constant;
constant : ident | number;
ident: ID;
number: NR;

NR : [0-9]+;
ID : [a-zA-Z] [a-zA-Z0-9]*;
COMMENT : ';'[A-Za-z0-9; \r\t]* -> skip;
WS : [ \t\r\n]+ -> skip;
