
How do I match a newline or the end of a file in a Raku grammar?

I have run into headaches trying to coerce a grammar to match the last line of a file when that line is not followed by a newline:

Line 1
Line 2 EOF

This attempted solution, which makes the newline optional, causes an infinite loop:

my grammar HC4 {
    token TOP {  <line>+ }
    token line { [ <header> | <not-header> ] \n? } # optional newline

    token header { <header-start> <header-content> }
    token not-header { <not-header-content> }
    token header-start { \s* '#' ** 1..6 }
    token header-content { \N* }
    token not-header-content { \N* }
}

The \N* bits can match the empty string '' after the last character of the last line, so <line>+ keeps succeeding forever without consuming any input.
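The zero-width match can be seen in isolation. This is only a minimal sketch; the variable name `$m` is made up for illustration:

```raku
# \N* is allowed to match zero characters, so it succeeds on an empty string.
my $m = '' ~~ / \N* /;
say $m.defined;  # whether a Match object came back at all
say $m.chars;    # how many characters that match consumed
```

A quantified token that can succeed without consuming anything gives the enclosing + no reason to stop.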

I tried using <[\n\Z]>, but the compiler complains and suggests \n?$ instead. I tried that too, and it does not work either. After a lot of trial and error, the only working solution I found requires me to create a new <blank> capture and to change the \N* to \N+ :

my grammar HC3 {
    token TOP {  <line>+ }
    token line { [ <header> | <blank> | <not-header> ] \n? }

    token header { <header-start> <header-content> }
    token blank { \h* <?[\n]> }
    token not-header { <not-header-content> }
    token header-start { \s* '#' ** 1..6 }
    token header-content { \N+ }
    token not-header-content { \N+ }
}

I'd like to know if there is a more straightforward way of accomplishing this, though. Thanks.

I think I may have found something that can work and is simple:

my grammar G {
    token TOP {  (^^ <line>)+ }
    token line { \N* \n? }
}

The ^^ anchor, which matches only at the beginning of a line, stops the infinite loop.
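As a quick check, the grammar can be run against input with and without a trailing newline. This is a sketch, not a thorough test; the sample strings and the `$text` and `$match` names are assumptions made for illustration:

```raku
grammar G {
    token TOP  { (^^ <line>)+ }
    token line { \N* \n? }
}

# Try the same two lines with and without a final newline.
for "Line 1\nLine 2", "Line 1\nLine 2\n" -> $text {
    my $match = G.parse($text);
    # $match[0] holds one entry per repetition of the (^^ <line>) group.
    say $match.defined ?? $match[0].elems !! 'no match';
}
```

Once the input is exhausted, ^^ no longer matches, so the repetition ends instead of matching the empty string again.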

OK, after some investigation, I discovered the root cause of my woes:

[Screenshot of the IntelliJ IDE settings]

This screenshot is from the IntelliJ IDE's Editor -> General settings. By default, the "Ensure every saved file ends with a line break" option is not checked. So when I cleaned up a file by deleting its very last line and then saved it, the final \n character was stripped. Turn that setting on to avoid my pain, suffering and deep psychological trauma.

I believe the simplest solution is something like this:

grammar LineOriented {
    token TOP {
        <line>* %% \n
    }

    token line {
        ^^ \N*
    }
}

Using %% allows, but does not require, a trailing separator, so the newline after the last line is optional.
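A short usage sketch, assuming the same two-line sample as in the question (the `$text` and `$match` names are illustrative only):

```raku
grammar LineOriented {
    token TOP  { <line>* %% \n }
    token line { ^^ \N* }
}

# Parse the sample with and without the final newline.
for "Line 1\nLine 2", "Line 1\nLine 2\n" -> $text {
    my $match = LineOriented.parse($text);
    # $match<line> is the list of captured lines; newlines are separators only.
    say $match.defined ?? $match<line>.elems !! 'no match';
}
```

Because the newline is a separator rather than part of <line>, each captured line holds only its own text.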
