Parsing comments in Rascal

I have a very basic question about parsing a fragment that contains a comment. First we import my favorite language, Pico:

import lang::pico::\syntax::Main;

Then we execute the following:

 parse(#Id,"a");

gives, as expected:

 Id: (Id) `a`

However,

parse(#Id,"a\n%% some comment\n");

gives a parse error.

What am I doing wrong here?

There are multiple problems.

  1. Id is a lexical, so layout (which includes comments) is never inserted into it.
  2. Layout is only inserted between the elements of a production, and the Id lexical has only a character class, so there is no place to insert layout.
  3. Even if Id were a syntax nonterminal with multiple elements, comments would be parsed between those elements, not before or after them.

For more on the difference between syntax, lexical, and layout, see: Rascal Syntax Definitions.

If you want to parse comments around a nonterminal, use the start modifier on that nonterminal. Normally, layout is only inserted between the elements of a production; with start, it is also inserted before and after it.

For example, take this grammar:

layout L = [\t\ ]* !>> [\t\ ];
lexical AB = "A" "B"+;
syntax CD = "C" "D"+;
start syntax EF = "E" "F"+;

This will be transformed into the following grammar:

AB   = "A" "B"+;
CD'  = "C" L "D"+;
EF'  = L "E" L "F"+ L;
"B"+ = "B"+ "B" | "B";
"D"+ = "D"+ L "D" | "D";
"F"+ = "F"+ L "F" | "F";
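To make the insertion rule concrete, here is a small Python model of it. It is only a sketch that mirrors the simplified table above, not Rascal's actual implementation; the function name weave and the string "L" standing for the layout nonterminal are made up for illustration.

```python
LAYOUT = "L"  # stands for the layout nonterminal in the table above


def weave(kind, symbols):
    """Return the elements of a production after layout insertion.

    kind is "lexical", "syntax", or "start" (a start syntax rule).
    """
    if kind == "lexical":
        # Lexicals never receive layout.
        return list(symbols)

    woven = []
    for index, symbol in enumerate(symbols):
        if index > 0:
            # Layout goes only between adjacent elements.
            woven.append(LAYOUT)
        woven.append(symbol)

    if kind == "start":
        # A start rule additionally gets layout before and after.
        woven = [LAYOUT] + woven + [LAYOUT]
    return woven


print(weave("lexical", ['"A"', '"B"+']))
print(weave("syntax", ['"C"', '"D"+']))
print(weave("start", ['"E"', '"F"+']))
print(weave("syntax", ["Id"]))
```

The last call shows why wrapping Id in a plain syntax rule with a single element would not help either: with only one element there is no "between" position, so no layout is inserted at all.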

So, in particular, if you want to parse a string with layout around it, you could write this:

lexical Id = [a-z]+;
start syntax P = Id i;
layout L = [\ \n\t]*;

parse(#start[P], "\naap\n").top // parses and returns the P node
parse(#start[P], "\naap\n").top.i // parses and returns the Id node
parse(#P, "\naap"); // parse error at offset 0 because the start wrapper is not around P
