IntelliJ: Grammar-Kit / BNF: how to recover from errors?

I am writing a custom language plugin for IntelliJ.

Here is a simplified example of the language. Note that the structure is recursive:

[Image: simplified example of the language]

I have successfully implemented the FLEX and BNF files, but I'm not sure how to add error recovery. I've read about recoverWhile and pin in Grammar-Kit's HOWTO, but I'm not sure how to apply them to my scenario.

I call the brown items above ("aaa", "ccc", etc.) "items".

I call the yellow ones ("bbb", "ddd", ...) "properties".

Each item has an item name (e.g. "aaa") and a single property (e.g. "bbb"), and can contain other items (e.g. "aaa" contains "ccc", "eeee", and "gg").

At the moment, the plugin doesn't behave well when an item is malformed. For example:

[Image: example containing a malformed item]

In this example, I would like the parser to "understand" that "ccc" is the name of an item with a missing property (e.g. by detecting a newline before the closing bracket).

I don't want the broken "ccc" item to influence the parsing of "eeee", but I do want the PSI tree to contain the elements of "ccc" that are present in the text (in this case, its name).

Here are the FLEX and BNF files that I use:

FLEX:

CRLF= \n|\r|\r\n
WS=[\ \t\f]
WORD=[a-zA-Z0-9_#\-]+

%state EOF

%%
<YYINITIAL>    {WORD} { yybegin(YYINITIAL); return MyLangTypes.TYPE_FLEX_WORD; }
<YYINITIAL>    \[     { yybegin(YYINITIAL); return MyLangTypes.TYPE_FLEX_OPEN_SQUARE_BRACKET; }
<YYINITIAL>    \]     { yybegin(YYINITIAL); return MyLangTypes.TYPE_FLEX_CLOSE_SQUARE_BRACKET; }
<YYINITIAL>    \{     { yybegin(YYINITIAL); return MyLangTypes.TYPE_FLEX_OPEN_CURLY_BRACKET; }
<YYINITIAL>    \}     { yybegin(YYINITIAL); return MyLangTypes.TYPE_FLEX_CLOSE_CURLY_BRACKET; }
({CRLF}|{WS})+        { return TokenType.WHITE_SPACE; }
{WS}+                 { return TokenType.WHITE_SPACE; }
.                     { return TokenType.BAD_CHARACTER; }

BNF:

myLangFile ::= (item|COMMENT|CRLF)
item ::=
    itemName
    (TYPE_FLEX_OPEN_SQUARE_BRACKET itemProperty? TYPE_FLEX_CLOSE_SQUARE_BRACKET?)?
    itemBody?
itemName ::= TYPE_FLEX_WORD
itemProperty ::= TYPE_FLEX_WORD
itemBody ::= TYPE_FLEX_OPEN_CURLY_BRACKET item* TYPE_FLEX_CLOSE_CURLY_BRACKET

I was eventually able to make it work like this:

myLangFile ::= (item|COMMENT|CRLF)
item ::=
    itemName
    itemProperties
    itemBody?
itemName ::= TYPE_FLEX_WORD
itemProperties ::= TYPE_FLEX_OPEN_SQUARE_BRACKET [!TYPE_FLEX_CLOSE_SQUARE_BRACKET itemProperty ((TYPE_FLEX_SEMICOLON itemProperty)|itemProperty)*] TYPE_FLEX_CLOSE_SQUARE_BRACKET {
    pin(".*") = 1
}
itemProperty ::= TYPE_FLEX_WORD TYPE_FLEX_EQUALS? itemPropertyValue? (TYPE_FLEX_EQUALS prv_swallowNextPropertyToPreventSyntaxErrors)?
private prv_swallowNextPropertyToPreventSyntaxErrors ::= TYPE_FLEX_WORD
itemPropertyValue ::= TYPE_FLEX_WORD
itemBody ::= TYPE_FLEX_OPEN_CURLY_BRACKET item* TYPE_FLEX_CLOSE_CURLY_BRACKET

It's not perfect; for example, it allows item properties to be separated by spaces (and not just by semicolons), but it does seem to solve the more important problem.

This may also be of interest: https://github.com/JetBrains/Grammar-Kit/blob/master/resources/messages/attributeDescriptions/recoverWhile.html
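For readers wondering how recoverWhile might be combined with the grammar above, here is a minimal, untested sketch. It assumes the item rule from the working grammar; the helper rule item_recover is a hypothetical name that does not appear in the original files. The idea is that pin = 1 keeps a partially matched item (its name) in the PSI tree, and recoverWhile skips tokens until something that could start the next item or close the enclosing body. Note that it cannot use newlines as a hint, because the lexer above returns them as whitespace.

```bnf
item ::= itemName itemProperties itemBody? {
    // Once itemName matches, keep the item even if the rest fails.
    pin = 1
    // After the rule finishes (matched or not), skip tokens while item_recover matches.
    recoverWhile = item_recover
}

// Hypothetical helper (not in the original grammar): keep skipping until
// a word (possible next item name) or a closing curly bracket is reached.
private item_recover ::= !(TYPE_FLEX_WORD | TYPE_FLEX_CLOSE_CURLY_BRACKET)
```

Whether this set of stop tokens suits the real language would need checking in the Grammar-Kit live preview; a different set may work better.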
