Grammar-free section in JavaCC

Here is a short JavaCC grammar:

PARSER_BEGIN(TestParser)

public class TestParser
    {
    }

PARSER_END(TestParser)

SKIP :
    {
    " "
    | "\t"
    | "\n"
    | "\r"
    }

TOKEN : /* LITERALS */
{
  <VOID: "void">
| <LPAR: "("> | <RPAR: ")">
| <LBRAC: "{"> | <RBRAC: "}">
| <COMMA: ",">
| <DATATYPE: "int">
| <#LETTER: ["_","a"-"z","A"-"Z"] >
| <#DIGIT: ["0"-"9"] >
| <DOUBLE_QUOTE_LITERAL: "\"" (~["\""])*"\"" >
| <IDENTIFIER: <LETTER> (<LETTER>|<DIGIT>)* >
| <VARIABLE: "$"<IDENTIFIER> >
}

public void input():{} { (statement())+ <EOF> }
private void statement():{}
    {
    <VOID> <IDENTIFIER> <LPAR> (<DATATYPE> <IDENTIFIER> (<COMMA> <DATATYPE> <IDENTIFIER>)*)? <RPAR>
        <LBRAC>

        <RBRAC>
    }

I'd like this parser to handle the following kind of input, which contains a "grammar-free" section (the character '}' marks the end of the section):

void fun(int i, int j)
 {
 Hello world the value of i is ${i} 
  and j=${j}.
 }

The grammar-free section would return a

java.util.List<String_or_VariableReference>

How should I modify my JavaCC parser to handle this section?

Thanks.

If I understand the question correctly, you want to allow essentially arbitrary input for a while and then switch back to your language. If you can decide when to make the switch based purely on tokens, this is easy to do with two lexical states. Use the DEFAULT state for your programming language. When a "{" is seen in the DEFAULT state, switch to the other state:

TOKEN: { <LBRACE : "{" > : FREE } 

In the FREE state, when a "}" is seen, switch back to the DEFAULT state; when any other character is seen, pass it on to the parser.

<FREE> TOKEN : { <RBRACE : "}" > : DEFAULT }
<FREE> TOKEN : { <OTHER : ~["}"] > : FREE }

In the parser you can have

void freeSection() : {} { <LBRACE> (<OTHER>)* <RBRACE> }

If you want to do something with all those OTHER characters, see question 5.2 in the JavaCC FAQ: http://www.engr.mun.ca/~theo/JavaCC-FAQ

If you want to capture variable references such as "${i}" in the FREE state, you can do that too. Add

<FREE> TOKEN : { <VARREF : "${" (["a"-"z"]|["A"-"Z"])* "}" > }
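To make the intended result concrete, here is a minimal hand-written Java sketch of what the FREE state does to the text after the opening "{": plain characters (the OTHER tokens) are merged into strings, each "${name}" (a VARREF) becomes a reference object, and the first remaining "}" ends the section. It is not generated by JavaCC and is not the answer's own code. The names FreeSectionSketch, VariableReference and parseFreeSection are hypothetical, and List<Object> stands in for the question's List<String_or_VariableReference>.

```java
import java.util.ArrayList;
import java.util.List;

final class FreeSectionSketch {

    /** Hypothetical stand-in for the question's variable-reference type. */
    public static final class VariableReference {
        public final String name;
        public VariableReference(String name) { this.name = name; }
        @Override public String toString() { return "VariableReference(" + name + ")"; }
    }

    /**
     * Scans the text that follows the opening "{" and stops at the first "}"
     * that is not part of a "${name}" reference (the RBRACE token).
     * Assumption: the input starts right after the "{" that entered FREE.
     */
    public static List<Object> parseFreeSection(String input) {
        List<Object> parts = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        int i = 0;
        while (i < input.length() && input.charAt(i) != '}') {
            int end = matchVarRef(input, i);
            if (end > 0) {                       // VARREF: longest match wins
                flush(text, parts);
                parts.add(new VariableReference(input.substring(i + 2, end - 1)));
                i = end;
            } else {                             // OTHER: any single character
                text.append(input.charAt(i));
                i++;
            }
        }
        if (i >= input.length()) {
            throw new IllegalArgumentException("missing closing '}'");
        }
        flush(text, parts);
        return parts;
    }

    /** Returns the index just past "${letters}" starting at pos, or -1 if there is no match. */
    private static int matchVarRef(String s, int pos) {
        if (!s.startsWith("${", pos)) return -1;
        int j = pos + 2;
        while (j < s.length() && isAsciiLetter(s.charAt(j))) j++;
        return (j < s.length() && s.charAt(j) == '}') ? j + 1 : -1;
    }

    private static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    /** Moves any pending plain text into the result list. */
    private static void flush(StringBuilder text, List<Object> parts) {
        if (text.length() > 0) {
            parts.add(text.toString());
            text.setLength(0);
        }
    }

    public static void main(String[] args) {
        // Body of the example function from the question, after its "{".
        String body = " Hello world the value of i is ${i} \n  and j=${j}.\n }";
        System.out.println(parseFreeSection(body));
    }
}
```

For the body of the example function in the question, this yields five elements: a string, a reference to i, a string, a reference to j, and a final string. In the JavaCC parser itself, the same accumulation could presumably live in Java actions attached to the OTHER and VARREF alternatives of the freeSection() production.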
