How to use matching delimiters in Raku
I'm trying to write a token that allows nested content with matching delimiters, where (AB) should result in a match of at least "AB", if not "(AB)", and (A(c)B) would return two matches, "(A(c)B)" and so on.
Code boiled down from its source:
#!/home/hsmyers/rakudo741/bin/perl6
use v6d;

my @tie;

class add-in {
    method tie($/) { @tie.push: $/; }
}

grammar tied {
    rule TOP { <line>* }
    token line {
        <.ws>?
        [
        | <tie>
        | <simpleNotes>
        ]+
        <.ws>?
    }
    token tie {
        [
        || <.ws>? <simpleNotes>+ <tie>* <simpleNotes>* <.ws>?
        || <openParen> ~ <closeParen> <tie>
        ]+
    }
    token openParen { '(' }
    token closeParen { ')' }
    token simpleNotes {
        [
        | <[A..Ga..g,'>0..9]>
        | <[|\]]>
        | <blank>
        ]
    }
}

my $text = "(c2D) | (aA) (A2 | B)>G A>F G>E (A,2 |\nD)>F A>c d>f |]";
tied.parse($text, actions => add-in.new).say;
$text.say;
for (@tie) {
    s:g/\v/\\n/;
    say "«$_»";
}
This gives a partially correct result of:
«c2D»
«aA»
«(aA)»
«A2 | B»
«\nD»
«A,2 |\nD»
«(A,2 |\nD)>F A>c d>f |]»
«(c2D) | (aA) (A2 | B)>G A>F G>E (A,2 |\nD)>F A>c d>f |]»
BTW, I'm not concerned about the newline; it is there only to check whether the approach can span text over two lines. So, stirring the ashes, I see captures with and without parentheses, and a very greedy capture or two.
Clearly I have a problem within my code. My knowledge of perl6 can best be described as "beginner", so I ask for your help. I'm looking for a general solution, or at least an example that can be generalized, and as always suggestions and corrections are welcome.
There are a few added complexities in what you have. For instance, you define a tie as being either (...) or just the ... . But that inner content is identical to the line.
Here's a rewritten grammar that greatly simplifies what you want. When writing grammars, it's helpful to start small and work up.
grammar Tied {
    rule TOP { <notes>+ %% \v+ }
    token notes {
        [
        | <tie>
        | <simple-note>
        ]+
        %%
        <.ws>?
    }
    token open-tie  { '(' }
    token close-tie { ')' }
    token tie { <.open-tie> ~ <.close-tie> <notes> }
    token simple-note { <[A..Ga..g,'>0..9|\]]> }
}
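The tie token relies on the regex ~ operator, which takes an opener and a closer and then the content that sits between them. As a rough illustration of that idea in isolation, here is a minimal sketch; the grammar name Nested and its tokens group and inner are hypothetical, not part of the grammar above.

```raku
# Hypothetical mini-grammar: only parentheses and lowercase letters.
grammar Nested {
    token TOP   { <group> }
    # '(' ~ ')' <inner> reads as: '(' then <inner> then ')'
    token group { '(' ~ ')' <inner> }
    # The content may itself contain further groups, which gives the nesting.
    token inner { [ <group> | <[a..z]> ]* }
}

my $balanced = Nested.parse('(a(b)c)');
say so $balanced;                  # True
say ~$balanced<group><inner>;      # a(b)c
```

Because inner refers back to group, each level of parentheses becomes its own sub-match, which is the same shape the Tied grammar gets from having tie contain notes.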
A few stylistic notes here. Grammars are classes, and it's customary to capitalize them. Tokens are methods, and tend to be lower case with kebab casing (you can of course use any style you want, though). In the tie token, you'll notice that I used <.open-tie>. The . means that we don't need to capture it (that is, we're just using it for matching and nothing else). In the notes token I was able to simplify things a lot by using the %% and making TOP a rule, which automatically adds some whitespace.
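To make the two pieces of syntax mentioned here concrete, the following is a small sketch, separate from the Tied grammar. The Wrapped grammar and its token names are made up for the illustration.

```raku
# % wants a separator strictly between items; %% also accepts a trailing one.
my $strict  = so 'a,b,c,' ~~ / ^ \w+ %  ',' $ /;
my $lenient = so 'a,b,c,' ~~ / ^ \w+ %% ',' $ /;
say $strict;    # False
say $lenient;   # True

# A leading dot calls the token for matching only, without a named capture.
grammar Wrapped {
    token TOP         { <.open-paren> <word> <.close-paren> }
    token open-paren  { '(' }
    token close-paren { ')' }
    token word        { \w+ }
}

my $match = Wrapped.parse('(abc)');
say $match.hash.keys;   # (word)
```

Only word shows up in the match, since the two delimiter tokens were called with the dot.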
Now, the order that I would create the tokens is this:
1. <simple-note>, because it's the most base-level item. A group of them would be…
2. <notes>, so I make that next. While doing that, I realize that a run of notes can also include a…
3. <tie>, so that's the next one. Inside of a tie, though, I'm just going to have another run of notes, so I can use <notes> inside it.
4. <TOP> at last, because if a line just has a run of notes, we can omit line and use %% \v+
Actions (often given the same name as your grammar, plus -Actions, so here I use class Tied-Actions { … }) are normally used to create an abstract syntax tree. But really, the best way to think of this is asking each level of the grammar what we want from it. I find that whereas with grammars it's easiest to build from the smallest element up, with actions it's easiest to go from the TOP down. This will also help you build more complex actions down the road:
1. What do we want from TOP?
In our case, we just want all the ties found in each <notes> token. That can be done with a simple loop (because we put a quantifier on <notes>, it will be Positional):
    method TOP ($/) {
        my @ties;
        @ties.append: .made for $<notes>;
        make @ties;
    }
This loops through each <notes> match and appends everything that <notes> made for us, which is nothing at the moment, but that's okay. Then, because we want the ties from TOP, we make them, which lets us access them after parsing.
2. What do we want from <notes>?
Again, just the ties, which we can collect in much the same way:
    method notes ($/) {
        my @ties;
        @ties.append: .made for $<tie>.grep(*.defined);
        make @ties;
    }
Rather than simply looping with for $<tie>, we have to grab just the defined ones. This is a consequence of doing [<foo>|<bar>]+: $<foo> will have a slot for each quantified match, whether or not <foo> did the matching (this is when you would often want to split things out into, say, a proto token note with a tie variant and a simple-note variant, but that's a bit advanced for this). Again, we grab whatever $<tie> made for us (we'll define that later) and we make it. Whatever we make is what other actions will find made by <notes> (like in TOP).
3. What do we want from <tie>?
Here I'm going to just go for the content of the tie; it's easy enough to grab the parentheses too if you want. The obvious choice would be make ~$<notes>, but that leaves off something important: $<notes> also has some ties. But those are easy enough to grab:
    method tie ($/) {
        my @ties = ~$<notes>;
        @ties.append: $<notes>.made;
        make @ties;
    }
When you parse, all you need to do is grab the .made of the Match:
say Tied.parse("a(b(c))d");
# 「a(b(c))d」
#  notes => 「a(b(c))d」
#   simple-note => 「a」
#   tie => 「(b(c))」 <-- there's a tie!
#    notes => 「b(c)」
#     simple-note => 「b」
#     tie => 「(c)」 <-- there's another!
#      notes => 「c」
#       simple-note => 「c」
#   simple-note => 「d」
say Tied.parse("a(b(c))d", actions => Tied-Actions).made;
# [b(c) c]
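For reference, the grammar and the three action methods discussed above can be gathered into one script. This is only a sketch that assembles the pieces already shown, under the Tied-Actions class name suggested earlier; the variable @found is introduced here for illustration, and the expected result in the comment is the one quoted above.

```raku
grammar Tied {
    rule TOP { <notes>+ %% \v+ }
    token notes {
        [
        | <tie>
        | <simple-note>
        ]+
        %%
        <.ws>?
    }
    token open-tie  { '(' }
    token close-tie { ')' }
    token tie { <.open-tie> ~ <.close-tie> <notes> }
    token simple-note { <[A..Ga..g,'>0..9|\]]> }
}

class Tied-Actions {
    # Collect whatever each run of notes made.
    method TOP ($/) {
        my @ties;
        @ties.append: .made for $<notes>;
        make @ties;
    }
    # Collect whatever each tie inside this run of notes made.
    method notes ($/) {
        my @ties;
        @ties.append: .made for $<tie>.grep(*.defined);
        make @ties;
    }
    # A tie contributes its own content plus any ties nested inside it.
    method tie ($/) {
        my @ties = ~$<notes>;
        @ties.append: $<notes>.made;
        make @ties;
    }
}

# No attributes are used, so the type object can be passed directly.
my @found = Tied.parse("a(b(c))d", actions => Tied-Actions).made;
say @found;   # [b(c) c]
```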
Now, if you really only will ever need the ties, and nothing else (which I don't think is the case), you can do things much more simply. Using the same grammar, use the following actions instead:
class Tied-Actions {
    has @!ties;
    method TOP ($/) { make @!ties }
    method tie ($/) { @!ties.push: ~$<notes> }
}
This has several disadvantages compared with the previous version: while it works, it's not very scalable. While you'll get every tie, you won't know anything about its context. Also, you have to instantiate Tied-Actions (that is, actions => Tied-Actions.new), whereas if you can avoid using any attributes, you can pass the type object.