How do you match a newline or end of file in a Raku grammar?

Question

I have run into headaches trying to get a grammar to match the last line of a file when that line is not followed by a newline:

Line 1
Line 2 EOF

This attempted solution, which makes the newline optional, causes an infinite loop:

my grammar HC4 {
    token TOP {  <line>+ }
    token line { [ <header> | <not-header> ] \n? } # optional newline

    token header { <header-start> <header-content> }
    token not-header { <not-header-content> }
    token header-start { \s* '#' ** 1..6 }
    token header-content { \N* }
    token not-header-content { \N* }
}

The \N* patterns can match the empty string after the last character of the last line, and with \n? also optional, <line> keeps succeeding there forever without consuming any input.
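To make the zero-width match concrete, here is a minimal sketch (the variable names are only illustrative) showing that a pattern built from \N* and \n? succeeds without consuming anything, both on an empty string and at the end of a line with no trailing newline:

```raku
# Both \N* and \n? are allowed to match nothing, so the pattern
# succeeds on an empty string.
my $empty = "" ~~ / ^ \N* \n? $ /;
say ?$empty;          # did it match?
say $empty.chars;     # how many characters were consumed

# The same happens when matching is anchored at the very end of "Line 2"
# (position 6): the match succeeds but starts and ends at that position.
my $at-end = "Line 2".match(/ \N* \n? /, :pos(6));
say $at-end.from, ' ', $at-end.to;
```

A quantified rule that can succeed like this gives the surrounding + nothing to make progress with, which is consistent with the hang described above.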

I tried using <[\n\Z]>, but the compiler complains and suggests \n?$ instead. I tried that too, and it does not work either. After a lot of trial and error, the only working solution I found requires adding a new <blank> token and changing \N* to \N+:

my grammar HC3 {
    token TOP {  <line>+ }
    token line { [ <header> | <blank> | <not-header> ] \n? }

    token header { <header-start> <header-content> }
    token blank { \h* <?[\n]> }
    token not-header { <not-header-content> }
    token header-start { \s* '#' ** 1..6 }
    token header-content { \N+ }
    token not-header-content { \N+ }
}

I'd like to know if there is a more straightforward way of accomplishing this, though. Thanks.

Answer 1

I think I may have found something that works and is simple:

my grammar G {
    token TOP {  (^^ <line>)+ }
    token line { \N* \n? }
}

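The ^^ anchor, which matches only at the beginning of a line, stops the infinite loop: once the last line has been consumed, the end of the input is not the start of a line, so the group cannot match again.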
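As a quick check of this approach, the sketch below parses the same two lines with and without a final newline. The helper name line-count is hypothetical and exists only for this example; the grammar is the one from this answer.

```raku
grammar G {
    token TOP  { (^^ <line>)+ }
    token line { \N* \n? }
}

# Hypothetical helper: number of lines captured, or Nil if the parse fails.
sub line-count(Str $input) {
    my $m = G.parse($input);
    $m ?? $m[0].elems !! Nil;
}

say line-count("Line 1\nLine 2");     # last line without a newline
say line-count("Line 1\nLine 2\n");   # last line with a newline
```

In both cases the parse should terminate, because ^^ does not match at the end of the string, even when the string ends with a newline.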

Answer 2

OK, after some investigation, I discovered the root cause of my woes:

The setting in question is under Editor -> General in the IntelliJ IDE. By default, the "Ensure every saved file ends with a line break" option is not checked. So when I saved a file after deleting the very last line to clean it up, the final \n character was stripped along with it. Turn that setting on to avoid my pain, suffering and deep psychological trauma.

Answer 3

I believe the simplest solution is something like this:

grammar LineOriented {
    token TOP {
        <line>* %% \n
    }

    token line {
        ^^ \N*
    }
}

Using %% allows, but does not require, a trailing separator, so the last line may or may not end with a newline.
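A small sketch of how this grammar behaves on both kinds of input follows. The helper name parsed-lines is hypothetical and used only for illustration; the grammar itself is unchanged from the answer.

```raku
grammar LineOriented {
    token TOP {
        <line>* %% \n
    }

    token line {
        ^^ \N*
    }
}

# Hypothetical helper: the text of each captured line, without newlines.
sub parsed-lines(Str $input) {
    my $m = LineOriented.parse($input);
    $m ?? $m<line>.map(*.Str).list !! Nil;
}

say parsed-lines("Line 1\nLine 2");     # no trailing newline
say parsed-lines("Line 1\nLine 2\n");   # trailing newline
```

Because the newline is the separator rather than part of <line>, the captured lines do not include it, which can be convenient in later processing.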

Answer 1
3 2022-03-27 09:23:24

Answer 2
2 2022-03-29 19:20:02

Answer 3
0 2022-07-25 18:07:41
