
How to use matching delimiters in Raku

I'm trying to write a token that allows nested content with matching delimiters, where (AB) should result in a match of at least "AB", if not "(AB)", and (A(c)B) would return two matches, "(A(c)B)" and so on.

Code boiled down from its source:

#!/home/hsmyers/rakudo741/bin/perl6
use v6d;

my @tie;

class add-in {
    method tie($/) { @tie.push: $/; }
}

grammar tied {
    rule TOP { <line>* }
    token line {
        <.ws>?
        [
            | <tie>
            | <simpleNotes>
        ]+
        <.ws>?
    }
    token tie {
        [
            || <.ws>? <simpleNotes>+ <tie>* <simpleNotes>* <.ws>?
            || <openParen> ~ <closeParen> <tie>
        ]+
    }
    token openParen { '(' }
    token closeParen { ')' }
    token simpleNotes {
        [
            | <[A..Ga..g,'>0..9]>
            | <[|\]]>
            | <blank>
        ]
    }
}

my $text = "(c2D) | (aA) (A2 | B)>G A>F G>E (A,2 |\nD)>F A>c d>f |]";

tied.parse($text, actions => add-in.new).say;
$text.say;
for (@tie) {
    s:g/\v/\\n/;
    say "«$_»";
}

This gives a partially correct result of:

«c2D»
«aA»
«(aA)»
«A2 | B»
«\nD»
«A,2 |\nD»
«(A,2 |\nD)>F A>c d>f |]»
«(c2D) | (aA) (A2 | B)>G A>F G>E (A,2 |\nD)>F A>c d>f |]»

BTW, I'm not concerned about the newline; it is there only to check whether the approach can span text over two lines. Stirring the ashes, I see captures with and without parentheses, and a very greedy capture or two.

Clearly I have a problem in my code. My knowledge of perl6 can best be described as "beginner", so I ask for your help. I'm looking for a general solution, or at least an example that can be generalized. As always, suggestions and corrections are welcome.

There are a few added complexities in your grammar. For instance, you define a tie as being either (...) or just the ... on its own. But that inner content is identical to the line.

Here's a rewritten grammar that greatly simplifies what you want. When writing grammars, it helps to start from the smallest pieces and work up.

grammar Tied {
    rule  TOP   { <notes>+ %% \v+ }
    token notes {
        [
        | <tie>
        | <simple-note>
        ] + 
        %%
        <.ws>?
    }
    token open-tie    { '(' }
    token close-tie   { ')' }
    token tie         { <.open-tie> ~ <.close-tie> <notes> }
    token simple-note { <[A..Ga..g,'>0..9|\]]>             }
}

A few stylistic notes here. Grammars are classes, and it's customary to capitalize them. Tokens are methods, and tend to be lower case in kebab-case (you can of course use any style you want, though). In the tie token, you'll notice that I used <.open-tie>. The . means that we don't need to capture it (that is, we're just using it for matching and nothing else). In the notes token I was able to simplify things a lot by using %% and by making TOP a rule, which automatically handles some whitespace.
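As a minimal sketch of the two regex features mentioned above, the following toy grammar uses ~ to wrap content in matching delimiters and %% to allow a separator between (and optionally after) items. The grammar Nested and its tokens group, items, and item are hypothetical names made up for this illustration; they are not part of the grammar in this answer.

```raku
# Hypothetical toy grammar; all names here are illustrative only.
grammar Nested {
    token TOP   { <group> }
    # '(' ~ ')' <items> reads as: '(' then <items> then ')'
    token group { '(' ~ ')' <items> }
    # One or more items separated by commas; %% also accepts a trailing comma
    token items { <item>+ %% ',' }
    # An item is either a nested group or a run of lowercase letters
    token item  { <group> || <[a..z]>+ }
}

say so Nested.parse('(a,b)');        # a flat group
say so Nested.parse('(a,(b,c),)');   # a nested group plus a trailing comma
```

Because item can itself be a group, the nesting depth is not limited by the grammar, which is the same trick the tie token uses by calling <notes> inside itself.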

Now, the order in which I would create the tokens is this:

  1. <simple-note>, because it's the most basic item. A group of them would be…
  2. <notes>, so I make that next. While doing that, I realize that a run of notes can also include a…
  3. <tie>, so that's the next one. Inside a tie, though, I'm just going to have another run of notes, so I can use <notes> inside it.
  4. <TOP> at last, because if a line just has a run of notes, we can omit line and use %% \v+.

Actions (often given the same name as your grammar, plus -Actions, so here I use class Tied-Actions { … }) are normally used to create an abstract syntax tree. But really, the best way to think of this is to ask each level of the grammar what we want from it. I find that whereas grammars are easiest to build from the smallest element up, actions are easiest to write from TOP down. This will also help you build more complex actions down the road:

  1. What do we want from TOP?
    In our case, we just want all the ties that we found in each <notes> token. That can be done with a simple loop (because we used a quantifier on <notes>, $<notes> will be Positional):
    method TOP ($/) { my @ties; @ties.append: .made for $<notes>; make @ties; }
    The above code creates our temp variable, loops through each <notes> match, and appends everything that <notes> made for us, which is nothing at the moment, but that's okay. Then, because we want the ties from TOP, we make them, which allows us to access them after parsing.
  2. What do you want from <notes>?
    Again, we just want the ties (but maybe some other time you'll want ties and glisses, or some other information). So we can grab the ties by doing basically the exact same thing:
    method notes ($/) { my @ties; @ties.append: .made for $<tie>.grep(*.defined); make @ties; }
    The only difference is that, rather than looping with just for $<tie>, we have to grab only the defined ones. This is a consequence of writing [<foo>|<bar>]+: $<foo> will have a slot for each quantified match, whether or not <foo> did the matching (this is when you would often want to split things out into, say, a proto token note with a tie variant and a simple-note variant, but that's a bit advanced for this). Again, we grab whatever $<tie> made for us (we'll define that later) and we make it. Whatever we make is what other actions will find made by <notes> (as in TOP).
  3. What do you want from <tie>? Here I'm just going to go for the content of the tie; it's easy enough to grab the parentheses too if you want. You'd think we'd just use make ~$<notes>, but that leaves out something important: $<notes> also has some ties. Those are easy enough to grab, though:
    method tie ($/) { my @ties = ~$<notes>; @ties.append: $<notes>.made; make @ties; }
    This ensures that we pass along not only the current outer tie, but also each individual inner tie (which in turn may have another inner one, and so on).

When you parse, all you need to do is grab the .made of the Match:

say Tied.parse("a(b(c))d");
# 「a(b(c))d」
# notes => 「a(b(c))d」
#  simple-note => 「a」
#  tie => 「(b(c))」          <-- there's a tie!
#   notes => 「b(c)」
#    simple-note => 「b」
#    tie => 「(c)」           <-- there's another!
#     notes => 「c」
#      simple-note => 「c」
#  simple-note => 「d」
say Tied.parse("a(b(c))d", actions => Tied-Actions).made;
# [b(c) c]
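
For reference, here is a sketch of how the grammar and the three action methods described above might be assembled into one script. Nothing here is new: the grammar and method bodies are copied from this answer, and only the enclosing class Tied-Actions { … } and the $ties variable are filled in.

```raku
grammar Tied {
    rule  TOP   { <notes>+ %% \v+ }
    token notes {
        [
        | <tie>
        | <simple-note>
        ] +
        %%
        <.ws>?
    }
    token open-tie    { '(' }
    token close-tie   { ')' }
    token tie         { <.open-tie> ~ <.close-tie> <notes> }
    token simple-note { <[A..Ga..g,'>0..9|\]]>             }
}

class Tied-Actions {
    # TOP: collect whatever each <notes> run made
    method TOP ($/) {
        my @ties;
        @ties.append: .made for $<notes>;
        make @ties;
    }
    # notes: collect whatever each <tie> in this run made
    method notes ($/) {
        my @ties;
        @ties.append: .made for $<tie>.grep(*.defined);
        make @ties;
    }
    # tie: this tie's own content, followed by any ties nested inside it
    method tie ($/) {
        my @ties = ~$<notes>;
        @ties.append: $<notes>.made;
        make @ties;
    }
}

# No attributes are used, so the type object can be passed directly
my $ties = Tied.parse("a(b(c))d", actions => Tied-Actions).made;
say $ties;
```

With this approach each outer tie is listed before the ties nested inside it, because the tie method puts its own content first and then appends what its inner <notes> made.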

Now, if you really will only ever need the ties and nothing else (which I don't think is the case), you can do things much more simply. Using the same grammar, use the following actions instead:

class Tied-Actions {
    has @!ties;
    method TOP ($/) { make @!ties            }
    method tie ($/) { @!ties.push: ~$<notes> }
}

This has several disadvantages compared to the previous approach: while it works, it's not very scalable. You'll get every tie, but you won't know anything about its context. Also, you have to instantiate Tied-Actions (that is, actions => Tied-Actions.new), whereas if you can avoid using any attributes, you can pass the type object.
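
A short sketch of that simpler variant in use, assuming the same grammar as above (repeated here only so the snippet runs on its own). The variable name $found is made up for the example.

```raku
grammar Tied {
    rule  TOP   { <notes>+ %% \v+ }
    token notes {
        [
        | <tie>
        | <simple-note>
        ] +
        %%
        <.ws>?
    }
    token open-tie    { '(' }
    token close-tie   { ')' }
    token tie         { <.open-tie> ~ <.close-tie> <notes> }
    token simple-note { <[A..Ga..g,'>0..9|\]]>             }
}

class Tied-Actions {
    has @!ties;
    method TOP ($/) { make @!ties            }
    method tie ($/) { @!ties.push: ~$<notes> }
}

# The class has an attribute, so an instance is needed: note the .new
my $found = Tied.parse("a(b(c))d", actions => Tied-Actions.new).made;
say $found;
```

Since an action method fires when its token finishes matching, the inner tie should be pushed before the outer one here, so the order differs from the first approach. That is one concrete way in which the flat list loses context.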
