How can I fix my DSL grammar to parse a problem statement?

I've been tasked with creating a grammar for a legacy DSL that's been in use for over 20 years. The original parser was written using a mess of regular expressions, or so I've been told.

The syntax is generally of the "if this variable is n then set that variable to m" style.

My grammar works for almost all cases, but there are a few places where it baulks because of a (mis)use of the && (logical and) operator.

My Lark grammar (which is LALR(1)) is:

?start: statement*

?statement: expression ";"

?expression : assignment_expression

?assignment_expression : conditional_expression
                       | primary_expression assignment_op assignment_expression

?conditional_expression : logical_or_expression
                        | logical_or_expression "?" expression (":" expression)?

?logical_or_expression : logical_and_expression
                       | logical_or_expression "||" logical_and_expression

?logical_and_expression : equality_expression
                        | logical_and_expression "&&" equality_expression

?equality_expression : relational_expression
                     | equality_expression equals_op relational_expression
                     | equality_expression not_equals_op relational_expression

?relational_expression : additive_expression
                       | relational_expression less_than_op additive_expression
                       | relational_expression greater_than_op additive_expression
                       | relational_expression less_than_eq_op additive_expression
                       | relational_expression greater_than_eq_op additive_expression

?additive_expression : multiplicative_expression
                     | additive_expression add_op multiplicative_expression
                     | additive_expression sub_op multiplicative_expression

?multiplicative_expression : primary_expression
                           | multiplicative_expression mul_op primary_expression
                           | multiplicative_expression div_op primary_expression
                           | multiplicative_expression mod_op primary_expression

?primary_expression : variable
                    | variable "[" INT "]"    -> array_accessor
                    | ESCAPED_STRING
                    | NUMBER
                    | unary_op expression
                    | invoke_expression
                    | "(" expression ")"

invoke_expression : ID ("." ID)* "(" argument_list? ")"
argument_list : expression ("," expression)*

unary_op : "-" -> negate_op
         | "!" -> invert_op
assignment_op : "="
add_op : "+"
sub_op : "-"
mul_op : "*"
div_op : "/"
mod_op : "%"
equals_op : "=="
not_equals_op : "!="
greater_than_op : ">"
greater_than_eq_op : ">="
less_than_op : "<"
less_than_eq_op : "<="

ID : CNAME | CNAME "%%" CNAME

?variable : ID
    | ID "@" ID           -> namelist_id
    | ID "@" ID "@" ID    -> exptype_id
    | "$" ID              -> environment_id

%import common.WS
%import common.ESCAPED_STRING
%import common.CNAME
%import common.INT
%import common.NUMBER
%import common.CPP_COMMENT

%ignore WS
%ignore CPP_COMMENT

Some examples that parse correctly:

(a == 2) ? (c = 12);
(a == 2 && b == 3) ? (c = 12);
(a == 2 && b == 3) ? (c = 12) : d = 13;
(a == 2 && b == 3) ? ((c = 12) && (d = 13));

But there are a few places where I see this construct:

(a == 2 && b == 3) ? (c = 12 && d = 13);

That is, the two assignments are joined by && but aren't in parentheses, and the parser doesn't like the second assignment operator. I assume this is because it's trying to parse it as (c = (12 && d) = 13).
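To make that guess concrete, here is a minimal hand-written recursive-descent sketch in Python. It is not Lark and not the grammar above: it covers only `=`, `&&`, `==`, names, numbers and parentheses, and the names `Parser` and `parse` are hypothetical. It keeps the same rule shape, though: the left side of `=` must be a primary expression, and the right operand of `&&` is an equality expression.

```python
import re


class Parser:
    """Tiny subset of the grammar above: '=', '&&', '==', parentheses."""

    def __init__(self, text):
        # '==' is listed before '=' so it is matched as one token.
        self.tokens = re.findall(r"==|&&|[=()]|\w+", text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def assignment(self):
        # assignment : logical_and | primary "=" assignment
        start = self.pos
        target = self.primary()
        if self.peek() == "=":
            self.take()
            return ("=", target, self.assignment())
        self.pos = start  # not an assignment: backtrack and re-parse
        return self.logical_and()

    def logical_and(self):
        # logical_and : equality | logical_and "&&" equality
        left = self.equality()
        while self.peek() == "&&":
            self.take()
            left = ("&&", left, self.equality())
        return left

    def equality(self):
        # equality : primary | equality "==" primary
        left = self.primary()
        while self.peek() == "==":
            self.take()
            left = ("==", left, self.primary())
        return left

    def primary(self):
        token = self.take()
        if token == "(":
            inner = self.assignment()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return inner
        if token is None or not re.fullmatch(r"\w+", token):
            raise SyntaxError(f"unexpected token {token!r}")
        return token


def parse(text):
    parser = Parser(text)
    tree = parser.assignment()
    if parser.peek() is not None:
        # Input is left over: the rules could not absorb this token.
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree


print(parse("(c = 12) && (d = 13)"))
# ('&&', ('=', 'c', '12'), ('=', 'd', '13'))

try:
    parse("c = 12 && d = 13")
except SyntaxError as err:
    print("rejected:", err)
```

In this sketch the right-hand side of the first `=` consumes `12 && d`, and the second `=` is left over with no rule to accept it, which matches the reading `(c = (12 && d) = 13)`.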

I've tried changing the order of the rules (this is my first non-toy DSL, so there's been a lot of trial and error), but I either get similar errors or the precedence is wrong. Switching to the Earley algorithm doesn't fix it either.

Instead of:

?assignment_expression : conditional_expression
                       | primary_expression assignment_op assignment_expression

?conditional_expression : logical_or_expression
                        | logical_or_expression "?" expression (":" expression)?

?logical_or_expression : logical_and_expression
                       | logical_or_expression "||" logical_and_expression

?logical_and_expression : equality_expression
                        | logical_and_expression "&&" equality_expression

?equality_expression : relational_expression
                     | equality_expression equals_op relational_expression
                     | equality_expression not_equals_op relational_expression

?relational_expression : additive_expression
                       | relational_expression less_than_op additive_expression
                       | relational_expression greater_than_op additive_expression
                       | relational_expression less_than_eq_op additive_expression
                       | relational_expression greater_than_eq_op additive_expression

?additive_expression : multiplicative_expression
                     | additive_expression add_op multiplicative_expression
                     | additive_expression sub_op multiplicative_expression

?multiplicative_expression : primary_expression
                           | multiplicative_expression mul_op primary_expression
                           | multiplicative_expression div_op primary_expression
                           | multiplicative_expression mod_op primary_expression

try:

?assignment_expression : conditional_expression
                       | primary_expression assignment_op expression

?conditional_expression : logical_or_expression
                        | logical_or_expression "?" expression (":" expression)?

?logical_or_expression : logical_and_expression
                       | logical_or_expression "||" expression

?logical_and_expression : equality_expression
                        | logical_and_expression "&&" expression

?equality_expression : relational_expression
                     | equality_expression equals_op expression
                     | equality_expression not_equals_op expression

?relational_expression : additive_expression
                       | relational_expression less_than_op expression
                       | relational_expression greater_than_op expression
                       | relational_expression less_than_eq_op expression
                       | relational_expression greater_than_eq_op expression

?additive_expression : multiplicative_expression
                     | additive_expression add_op expression
                     | additive_expression sub_op expression

?multiplicative_expression : primary_expression
                           | multiplicative_expression mul_op expression
                           | multiplicative_expression div_op expression
                           | multiplicative_expression mod_op expression
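Before adopting this rewrite, it may be worth checking what tree shapes it yields. The sketch below is a hand-written recursive-descent parser in Python, not Lark, so Lark's LALR(1) or Earley results may differ. It covers only `=`, `&&`, `==` and parentheses, and the names `parse`, `expression`, `binary` and `BINARY_LEVELS` are hypothetical. Its one deliberate feature is the change suggested above: the right operand of every binary operator is a full `expression`.

```python
import re

BINARY_LEVELS = ["&&", "=="]  # loosest-binding operator first


def peek(tokens, pos):
    return tokens[pos] if pos < len(tokens) else None


def expression(tokens, pos):
    # expression : primary "=" expression | <binary levels>
    target, after = primary(tokens, pos)
    if peek(tokens, after) == "=":
        value, after = expression(tokens, after + 1)
        return ("=", target, value), after
    return binary(tokens, pos, 0)  # not an assignment: re-parse from pos


def binary(tokens, pos, level):
    if level == len(BINARY_LEVELS):
        return primary(tokens, pos)
    op = BINARY_LEVELS[level]
    left, pos = binary(tokens, pos, level + 1)
    while peek(tokens, pos) == op:
        # The loosened rule: the right operand is a full expression.
        right, pos = expression(tokens, pos + 1)
        left = (op, left, right)
    return left, pos


def primary(tokens, pos):
    token = peek(tokens, pos)
    if token == "(":
        inner, pos = expression(tokens, pos + 1)
        if peek(tokens, pos) != ")":
            raise SyntaxError("expected ')'")
        return inner, pos + 1
    if token is None or not re.fullmatch(r"\w+", token):
        raise SyntaxError(f"unexpected token {token!r}")
    return token, pos + 1


def parse(text):
    tokens = re.findall(r"==|&&|[=()]|\w+", text)
    tree, pos = expression(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


print(parse("c = 12 && d = 13"))
# ('=', 'c', ('&&', '12', ('=', 'd', '13')))

print(parse("a == 2 && b == 3"))
# ('==', 'a', ('&&', '2', ('==', 'b', '3')))
```

In this sketch the problem statement is accepted, but as `c = (12 && (d = 13))` rather than `(c = 12) && (d = 13)`, and `a == 2 && b == 3` groups as `a == (2 && (b == 3))`. If Lark behaves similarly, the loosened rules trade the syntax error for changed precedence, so conditions that relied on `==` binding tighter than `&&` would need re-checking.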

Thanks for all the help, but as of this morning the customer and I agreed that the offending lines of code will be fixed, rather than torturing the grammar to make them work. Only 9 of the 3300 lines of code are ambiguous, so the extra effort and hackiness weren't worth it.
