
Recursion in parser combinator running out of stack space

I've been creating a simple parser combinator library in Java, and for a first attempt I'm using programmatic structures to define both the tokens and the parser grammar, see below:

final Combinator c2 = new CombinatorBuilder()
    /*
    .addParser("SEXPRESSION", of(Option.of(new Terminal("LPAREN"), zeroOrMore(new ParserPlaceholder("EXPRESSION")), new Terminal("RPAREN"))))
    .addParser("QEXPRESSION", of(Option.of(new Terminal("LBRACE"), zeroOrMore(new ParserPlaceholder("EXPRESSION")), new Terminal("RBRACE"))))
    */
    .addParser("SEXPRESSION", of(Option.of(new Terminal("LPAREN"), new ParserPlaceholder("EXPRESSION"), new Terminal("RPAREN"))))
    .addParser("QEXPRESSION", of(Option.of(new Terminal("LBRACE"), new ParserPlaceholder("EXPRESSION"), new Terminal("RBRACE"))))
    .addParser("EXPRESSION", of(
        Option.of(new Terminal("NUMBER")),
        Option.of(new Terminal("SYMBOL")),
        Option.of(new Terminal("STRING")),
        Option.of(new Terminal("COMMENT")),
        Option.of(new ParserPlaceholder("SEXPRESSION")),
        Option.of(new ParserPlaceholder("QEXPRESSION"))
    ))
    .build();

Taking the first Parser, "SEXPRESSION", defined using the builder, I can explain the structure:

Parameters to addParser:

  1. Name of parser
  2. An ImmutableList of disjunctive Options

Parameters to Option.of:

  1. An array of Elements, where each element is either a Terminal or a ParserPlaceholder, which is later substituted with the actual Parser whose name matches.

The idea is to be able to reference one Parser from another and thus express more complex grammars.

The problem I'm having is that using the grammar above to parse a string value such as "(+ 1 2)" gets stuck in an infinite recursive call when parsing the RPAREN ')', as the "SEXPRESSION" and "EXPRESSION" Parsers have "one or many" cardinality.

I'm sure I could get creative and come up with some way of limiting the depth of the recursive calls, perhaps by detecting when the "SEXPRESSION" parser hands off to the "EXPRESSION" parser, which then hands back to the "SEXPRESSION" parser, without any tokens being consumed, and dropping out at that point. But I don't want a hacky solution if a standard solution exists.
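To make the "drop out when no tokens are consumed" idea concrete, here is a minimal sketch. It assumes a hand-written recursive-descent parser over a list of token type names, not the builder API shown above; GuardedParser, guarded, and the LOOP rule are hypothetical names used only for illustration. Each rule records "RULE@position" while it is running and fails immediately if the same rule is entered again at the same position.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.IntUnaryOperator;

// Hypothetical sketch; none of these names come from the library in the question.
class GuardedParser {
    static final int FAIL = -1;

    private final List<String> tokens;                  // token type names, e.g. "LPAREN"
    private final Set<String> active = new HashSet<>(); // "RULE@position" entries currently on the call stack

    GuardedParser(List<String> tokens) {
        this.tokens = tokens;
    }

    // Runs a rule body unless the same rule is already running at the same position.
    private int guarded(String rule, int pos, IntUnaryOperator body) {
        String key = rule + "@" + pos;
        if (!active.add(key)) {
            return FAIL; // re-entered without consuming a token: abandon this path
        }
        try {
            return body.applyAsInt(pos);
        } finally {
            active.remove(key);
        }
    }

    private boolean at(int pos, String type) {
        return pos < tokens.size() && tokens.get(pos).equals(type);
    }

    // EXPRESSION := NUMBER | SYMBOL | SEXPRESSION
    // Returns the position after the match, or FAIL.
    int expression(int pos) {
        return guarded("EXPRESSION", pos, p -> {
            if (at(p, "NUMBER") || at(p, "SYMBOL")) {
                return p + 1;
            }
            return sexpression(p);
        });
    }

    // SEXPRESSION := LPAREN EXPRESSION* RPAREN
    int sexpression(int pos) {
        return guarded("SEXPRESSION", pos, p -> {
            if (!at(p, "LPAREN")) {
                return FAIL;
            }
            int cur = p + 1;
            while (true) {
                int next = expression(cur);
                if (next <= cur) {
                    break; // failed or made no progress, so stop repeating
                }
                cur = next;
            }
            return at(cur, "RPAREN") ? cur + 1 : FAIL;
        });
    }

    // LOOP := LOOP | NUMBER  (deliberately left-recursive, to exercise the guard)
    int loop(int pos) {
        return guarded("LOOP", pos, p -> {
            int viaSelf = loop(p); // rejected by the guard instead of recursing forever
            if (viaSelf != FAIL) {
                return viaSelf;
            }
            return at(p, "NUMBER") ? p + 1 : FAIL;
        });
    }
}
```

With the tokens LPAREN SYMBOL NUMBER NUMBER RPAREN (that is, "(+ 1 2)"), sexpression(0) consumes all five tokens, and the left-recursive LOOP rule terminates rather than overflowing the stack. Note that a guard like this only makes such a rule terminate; it does not by itself make a left-recursive grammar parse the way its author intended.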

Any ideas?

Thanks

Not to dodge the question, but I don't think there's anything wrong with launching an application with VM arguments that increase the stack size.

This can be done in Java by adding the flag -Xss<size>, for example -XssNm, where N is the thread stack size in megabytes.

The default Java thread stack size depends on the JVM and platform (typically somewhere between 512 KB and 1 MB), which, frankly, is hardly any memory at all. Minor optimizations aside, I found it difficult, if not impossible, to implement complex recursive solutions with that little stack, especially because Java isn't the least bit efficient when it comes to recursion.

Some examples of this flag are as follows:

  • -Xss4M 4 MB
  • -Xss2G 2 GB

The flag goes right after java on the command line, before the class or JAR name, for example java -Xss4M -jar yourapp.jar (yourapp.jar is a placeholder). If you are using an IDE like Eclipse, you can set it under VM arguments in the run configuration.
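If changing the launch flags is inconvenient, a related option is to run only the deep recursion on a thread created with a larger stack, using the Thread constructor that takes a stackSize argument. The sketch below is illustrative: BigStackRunner, runWithStack, and depth are hypothetical names, and the Thread documentation describes stackSize as a suggestion that a given JVM may adjust or ignore.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical helper, not part of any library mentioned above.
class BigStackRunner {

    // Runs the task on a new thread that requests stackBytes of stack, then returns its result.
    static <T> T runWithStack(long stackBytes, Supplier<T> task) {
        AtomicReference<T> result = new AtomicReference<>();
        AtomicReference<Throwable> failure = new AtomicReference<>();
        Thread worker = new Thread(null, () -> {
            try {
                result.set(task.get());
            } catch (Throwable t) { // includes StackOverflowError
                failure.set(t);
            }
        }, "deep-recursion-worker", stackBytes);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        if (failure.get() != null) {
            throw new RuntimeException(failure.get());
        }
        return result.get();
    }

    // Stand-in for a deeply recursive parse: recurses n levels deep.
    static int depth(int n) {
        return n == 0 ? 0 : 1 + depth(n - 1);
    }
}
```

For example, BigStackRunner.runWithStack(256L * 1024 * 1024, () -> BigStackRunner.depth(100_000)) asks for a 256 MB stack for that one call only, leaving every other thread at the default size.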

Hope this helps!
