
How to get Coco/R parser to not be greedy

My ATG file defines a code block as

Codeblock = "<#" {anychar} "#>"

When the Coco/R-generated parser comes across a block like this:

<#
   a=5;
   print "Hello world!";
#>

The token picks up

a=5;
print "Hello world!";

This is exactly what I want.

However, when it comes across code like this:

<#
   a=5;
   print "Hello World";
#>
<#
   b=5;
   print "Foo Bar";
#>

The token greedily picks up

 a=5;
 print "Hello World";
 #>
 <#
   b=5;
   print "Foo Bar";

How can I let Coco/R know not to do this?

Try this:

codeblock = "<#" {anychar} "#>" .
anychar = (expression|procedure) ";" .

Because every anychar now has to end with ";", Coco/R can no longer mistake the "#> <#" sequence for part of an anychar.
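
The idea can be illustrated outside Coco/R with a small hand-written Python sketch. If the body is a sequence of ";"-terminated statements, the block has to end at the first "#>", because "#>" cannot begin a statement. This is only an analogy, not Coco/R output: the function name parse_codeblock is hypothetical, and a "statement" is simplified here to any text up to the next ";" (so a ";" inside a string literal would break it), standing in for the real expression and procedure rules.

```python
def parse_codeblock(text, pos=0):
    """Parse one '<#' {statement ';'} '#>' block starting at pos.

    Returns (statements, end_pos), where end_pos is the index just past
    the closing '#>'. A statement is any run of characters up to the
    next ';' (an assumed stand-in for expression | procedure).
    """
    if not text.startswith("<#", pos):
        raise ValueError("expected '<#'")
    pos += 2
    statements = []
    while True:
        # Skip whitespace between statements.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        # The closing delimiter ends the block as soon as it is seen.
        if text.startswith("#>", pos):
            return statements, pos + 2
        end = text.find(";", pos)
        if end == -1:
            raise ValueError("unterminated statement or missing '#>'")
        statements.append(text[pos:end].strip())
        pos = end + 1


source = '<#\n a=5;\n print "Hello World";\n#>\n<#\n b=5;\n print "Foo Bar";\n#>'
first, end = parse_codeblock(source)
second, _ = parse_codeblock(source, end + 1)  # skip the newline between blocks
print(first)
print(second)
```

Each call stops at its own "#>", so the two blocks come back as two separate statement lists.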

Your terminals need to be more explicit.

"ANY" introduces ambiguity, which is why the "#> <#" in the middle is being swallowed. Your codeblock treats everything between the FIRST "<#" and the LAST "#>" as part of the set "ANY", because that is how your grammar defines a codeblock.

Perhaps try:

code = codeblock {codeblock} EOF .
codeblock = "<#" {anychar} "#>" .
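
The first-to-last behaviour described above is the usual longest-match rule, and it can be reproduced outside Coco/R. The Python sketch below uses regular expressions purely as an analogy (it is not Coco/R code, and the variable names are illustrative). A greedy body runs to the last "#>", while a body that stops at the first "#>" yields one match per block.

```python
import re

source = '''<#
   a=5;
   print "Hello World";
#>
<#
   b=5;
   print "Foo Bar";
#>'''

# Greedy: ".*" takes as much as it can, so the body ends at the LAST "#>".
greedy = re.findall(r"<#(.*)#>", source, flags=re.DOTALL)

# Non-greedy: ".*?" takes as little as it can, so each body ends at the
# FIRST "#>" that follows its "<#".
lazy = re.findall(r"<#(.*?)#>", source, flags=re.DOTALL)

print(len(greedy))  # 1
print(len(lazy))    # 2
```

The single greedy match contains the inner "#>" and "<#", which is the same over-long token shown in the question.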

