
LOOKAHEADs for the JavaScript/ECMAScript array literal production

I am currently implementing a JavaScript/ECMAScript 5.1 parser with JavaCC and am having problems with the ArrayLiteral production.

ArrayLiteral :
    [ Elision_opt ]
    [ ElementList ]
    [ ElementList , Elision_opt ]

ElementList :
    Elision_opt AssignmentExpression
    ElementList , Elision_opt AssignmentExpression

Elision :
    ,
    Elision ,

I have three questions and will ask them one by one.

This is the second one.


I have simplified this production to the following form:

ArrayLiteral:
    "[" ("," | AssignmentExpression ",") * AssignmentExpression ? "]"

Please see the first question on whether this simplification is correct:

How to simplify JavaScript/ECMAScript array literal production?

Now I have tried to implement it in JavaCC as follows:

void ArrayLiteral() :
{
}
{
    "["
    (
        ","
    |   AssignmentExpression()
        ","
    ) *
    (
        AssignmentExpression()
    ) ?
    "]"
}

JavaCC complains about an ambiguous "," or AssignmentExpression (its contents). Obviously, a LOOKAHEAD specification is required. I have spent a lot of time trying to figure the LOOKAHEADs out and tried different things, like

  • LOOKAHEAD (AssignmentExpression() ",") in (...)*
  • LOOKAHEAD (AssignmentExpression() "]") in (...)?

and a few other variations, but I could not get rid of the JavaCC warning.

I fail to understand why this does not work:

void ArrayLiteral() :
{
}
{
    "["
    (
        LOOKAHEAD ("," | AssignmentExpression() ",")
        ","
    |   AssignmentExpression()
        ","
    ) *
    (
        LOOKAHEAD (AssignmentExpression() "]")
        AssignmentExpression()
    ) ?
    "]"
}

OK, AssignmentExpression() per se is ambiguous, but the trailing "," or "]" in the LOOKAHEADs should make it clear which of the choices should be taken. Or am I mistaken here?

What would a correct LOOKAHEAD specification for this production look like?

Update

This did not work, unfortunately:

void ArrayLiteral() :
{
}
{
    "["
    (
        ","
    |
        LOOKAHEAD (AssignmentExpression() ",")
        AssignmentExpression()
        ","
    ) *
    (
        AssignmentExpression()
    ) ?
    "]"
}

Warning:

Warning: Choice conflict in (...)* construct at line 6, column 5.
         Expansion nested within construct and expansion following construct
         have common prefixes, one of which is: "function"
         Consider using a lookahead of 2 or more for nested expansion.

Line 6 is the ( before the first LOOKAHEAD. The common prefix "function" is simply one of the possible starts of AssignmentExpression.

JavaCC produces top-down parsers. I'll say up front that I'm not a fan of top-down parser generators, so I'm not a JavaCC expert and I don't have it handy to test.

(Edit: I thought something else would work, but I realized afterwards that I don't understand how JavaCC attaches lookahead to actual choices. In the case of ( A | B )* C, there are actually three possible choices: A, B and C. I thought it would consider all three of them, but it's possible that it does them two at a time. So the following is yet another guess.)

Having said that, I think the following would work, but it involves parsing just about every AssignmentExpression() twice.

{
    "["
    (
        ","
    |
        AssignmentExpression()
        ","
    ) *
    (
        LOOKAHEAD (AssignmentExpression() "]")
        AssignmentExpression()
    ) ?
    "]"
}

As I indicated in the linked question, a better solution is to rewrite the production differently:

"[" AssignmentExpression ? ("," AssignmentExpression ?) * "]"

That leads to a one-token lookahead grammar, so you won't need a LOOKAHEAD declaration to handle it.
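To see why one token is enough for the rewritten production, here is a minimal hand-written recognizer in plain Java. It is only a sketch and not JavaCC output: the class name ArrayLiteralRecognizer and the accepts helper are made up for illustration, and AssignmentExpression is replaced by the assumption that any single token other than "[", "]" or "," is a complete expression. Every decision looks at the current token and nothing else.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of: "[" AssignmentExpression ? ("," AssignmentExpression ?) * "]"
class ArrayLiteralRecognizer {
    private final List<String> tokens;
    private int pos = 0;

    ArrayLiteralRecognizer(List<String> tokens) {
        this.tokens = tokens;
    }

    // True if the current token equals s (one token of lookahead).
    private boolean peekIs(String s) {
        return pos < tokens.size() && tokens.get(pos).equals(s);
    }

    // ASSUMPTION: stand-in for AssignmentExpression. Any single token
    // that is not "[", "]" or "," is treated as a whole expression.
    private boolean startsExpression() {
        return pos < tokens.size()
            && !peekIs("[") && !peekIs("]") && !peekIs(",");
    }

    boolean arrayLiteral() {
        if (!peekIs("[")) return false;
        pos++;
        if (startsExpression()) pos++;      // optional first element
        while (peekIs(",")) {               // loop entry decided by "," alone
            pos++;
            if (startsExpression()) pos++;  // optional element after the comma
        }
        if (!peekIs("]")) return false;
        pos++;
        return pos == tokens.size();        // no trailing input allowed
    }

    static boolean accepts(String... tokens) {
        return new ArrayLiteralRecognizer(Arrays.asList(tokens)).arrayLiteral();
    }

    public static void main(String[] args) {
        System.out.println(accepts("[", "a", ",", ",", "b", "]")); // true
        System.out.println(accepts("[", "a", "b", "]"));           // false
    }
}
```

The loop is entered only on "," and the optional element is taken only when the current token can start an expression, so the loop body and whatever follows it never share a first token. That shared first token is exactly what the original form with AssignmentExpression() "," inside the loop runs into.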

Here is yet another approach. It has the advantage of identifying which commas indicate undefined elements without using any semantic actions.

void ArrayLiteral() : {} { "[" MoreArrayLiteral() }

void MoreArrayLiteral() : {} {
    "]"
|    "," /* undefined item */ MoreArrayLiteral()
|    AssignmentExpression() ( "]" |  "," MoreArrayLiteral() )
}
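The shape of MoreArrayLiteral can be mirrored in plain Java to check which commas end up as undefined items. This is a rough sketch under the same kind of assumption as any toy tokenizer: the class name MoreArrayLiteralSketch and the parse helper are hypothetical, and a single non-punctuation token stands in for AssignmentExpression. A comma met at the start of MoreArrayLiteral is an elision and is recorded as null; a comma met right after an expression is just a separator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MoreArrayLiteralSketch {
    private final List<String> tokens;
    private final List<String> elements = new ArrayList<>();
    private int pos = 0;

    private MoreArrayLiteralSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    private String next() {
        if (pos >= tokens.size()) {
            throw new IllegalArgumentException("unexpected end of input");
        }
        return tokens.get(pos++);
    }

    // ArrayLiteral : "[" MoreArrayLiteral
    private void arrayLiteral() {
        String t = next();
        if (!t.equals("[")) {
            throw new IllegalArgumentException("expected [ but found " + t);
        }
        moreArrayLiteral();
    }

    // MoreArrayLiteral :
    //     "]"
    //   | "," MoreArrayLiteral                           (undefined item)
    //   | AssignmentExpression ( "]" | "," MoreArrayLiteral )
    private void moreArrayLiteral() {
        String t = next();
        if (t.equals("]")) {
            return;
        }
        if (t.equals(",")) {
            elements.add(null);          // undefined item
            moreArrayLiteral();
            return;
        }
        if (t.equals("[")) {
            throw new IllegalArgumentException("nested literals are outside this sketch");
        }
        elements.add(t);                 // ASSUMPTION: one token = one expression
        String sep = next();
        if (sep.equals(",")) {
            moreArrayLiteral();          // separator, not an elision
        } else if (!sep.equals("]")) {
            throw new IllegalArgumentException("expected , or ] but found " + sep);
        }
    }

    static List<String> parse(String... tokens) {
        MoreArrayLiteralSketch p = new MoreArrayLiteralSketch(Arrays.asList(tokens));
        p.arrayLiteral();
        if (p.pos != tokens.length) {
            throw new IllegalArgumentException("trailing input");
        }
        return p.elements;
    }

    public static void main(String[] args) {
        System.out.println(parse("[", "a", ",", ",", "b", "]")); // [a, null, b]
    }
}
```

One trade-off of the right-recursive form is that the recursion depth grows with the number of elements, which may matter for very long array literals.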

This is how I solved it (thanks to the answer by @rici):

JSArrayLiteral ArrayLiteral() : 
{
    boolean lastElementWasAssignmentExpression = false;
}
{
    "["
    (
        (
            AssignmentExpression()
            {
                // Do something with expression
                lastElementWasAssignmentExpression = true;
            }
        ) ?
        (
            ","
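            {
                if (!lastElementWasAssignmentExpression)
                {
                    // Do something with elision
                }
                // Reset the flag so that the next comma counts as an
                // elision unless an expression follows this one.
                lastElementWasAssignmentExpression = false;
            }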
            (
                AssignmentExpression()
                {
                    // Do something with expression
                    lastElementWasAssignmentExpression = true;
                }
            ) ?
        ) *
    )
    "]"
    {
        // Do something with results
    }
}
