Quotemeta in Perl? How to solve error: "Nested quantifiers in regex; marked by <-- HERE"

The following is the code:

my $vowels = "[aiou~NFKPQRIJ]";
my @diactok;
for $rx (@tokens) {
    $rx =~ s/.\K/$vowels?/g;
    if ($diac =~ /($rx)/) {
        push @diactok, $diac =~ /$rx/g;
    }
}

It comes from this previous question: How do I tokenise a word given tokens that are subsumed incompletely in the word?

It's fine except for this error (I did "use diagnostics"):

Nested quantifiers in regex; marked by <-- HERE in m/(A[aiou~NFKPQRIJ]?l[aiou~NFKPQRIJ]?* <-- HERE [aiou~NFKPQRIJ]?y[aiou~NFKPQRIJ]?n[aiou~NFKPQRIJ]?)/ at tokenizeForCRFinput.pl line 47, line 288670 (#3)

(F) You can't quantify a quantifier without intervening parentheses. So things like ** or +* or ?* are illegal. The <-- HERE shows in the regular expression about where the problem was discovered.

Note that the minimal matching quantifiers, *?, +?, and ?? appear to be nested quantifiers, but aren't. See perlre.

Uncaught exception from user code: Nested quantifiers in regex; marked by <-- HERE in m/(A[aiou~NFKPQRIJ]?l[aiou~NFKPQRIJ]?* <-- HERE [aiou~NFKPQRIJ]?y[aiou~NFKPQRIJ]?n[aiou~NFKPQRIJ]?)/ at tokenizeForCRFinput.pl line 47, line 288670. at tokenizeForCRFinput.pl line 47

Line 47 is this one:

if ($diac =~ /($rx)/)

I tried quotemeta but that didn't work. Maybe I'm using it wrong? Some of the strings captured in $diac do indeed have special characters like '?' and '*'.

The line:

$rx =~ s/.\K/$vowels?/g;

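is the culprit, if you do indeed have metacharacters in @tokens. Each character of the token is copied into the pattern unescaped, so a literal * or ? ends up acting as a quantifier on the preceding $vowels?. Try this: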

$rx =~ s/(.)/ quotemeta($1) . "$vowels?" /eg;

Note that you cannot quotemeta the whole regex, since the metacharacters in $vowels are needed.
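A minimal sketch of the difference between the two substitutions. It assumes a single token `Al*yn` (taken from the error message) and a made-up `$diac` value; the helper name `compiles` is hypothetical and not part of the original script.

```perl
use strict;
use warnings;

my $vowels = "[aiou~NFKPQRIJ]";

# Hypothetical helper: true if the string compiles as a regex, false otherwise.
sub compiles {
    my ($pattern) = @_;
    my $re = eval { qr/($pattern)/ };
    return defined $re ? 1 : 0;
}

# Original substitution: the '*' from the token is copied in unescaped,
# so it ends up quantifying the preceding "$vowels?".
my $broken = "Al*yn";
$broken =~ s/.\K/$vowels?/g;

# Suggested substitution: escape each character, then append "$vowels?".
my $fixed = "Al*yn";
$fixed =~ s/(.)/ quotemeta($1) . "$vowels?" /eg;

print "broken pattern: $broken\n";
print "broken compiles: ", compiles($broken), "\n";
print "fixed pattern:  $fixed\n";
print "fixed compiles:  ", compiles($fixed), "\n";

# Made-up diacritised string containing a literal '*' (an assumption).
my $diac = 'Aal*uyn';
my @diactok = $diac =~ /($fixed)/g;
print "matched: @diactok\n";
```

The first pattern should fail to compile with the same "Nested quantifiers" error, while the second compiles and treats the `*` in the token as a literal character.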

The pattern is originally

(Al*yn)

You change it to

(A[aiou~NFKPQRIJ]?l[aiou~NFKPQRIJ]?*[aiou~NFKP...

Like the message says, [aiou~NFKPQRIJ]?* is wrong. You didn't specify what you want, so it's hard to give you a fix.

Maybe you want

(A(?:[aiou~NFKPQRIJ]?)l(?:[aiou~NFKPQRIJ]?)*(?:[aiou~NFKP...

If so, just use

$rx =~ s/.\K/(?:$vowels?)/g;
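As a rough illustration of what this variant does, the sketch below applies it to the token `Al*yn` and tries two made-up input strings (both are assumptions, not data from the question). The `*` from the token now quantifies the non-capturing group instead of being a literal character.

```perl
use strict;
use warnings;

my $vowels = "[aiou~NFKPQRIJ]";

# Wrap each inserted "$vowels?" in a non-capturing group so that a
# following '*' from the token quantifies the group instead of the '?'.
my $rx = "Al*yn";
$rx =~ s/.\K/(?:$vowels?)/g;
print "pattern: $rx\n";

my $re = qr/($rx)/;

# Hypothetical inputs: one with vowels after 'l', one with a literal '*'.
for my $diac ('Aluiyn', 'Al*yn') {
    if ($diac =~ $re) {
        print "$diac: match ($1)\n";
    }
    else {
        print "$diac: no match\n";
    }
}
```

With this reading the token is treated as a regex in its own right, so an input containing a literal `*` is not matched. If the tokens are meant to be literal text, the quotemeta approach is the one to use.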

Or maybe you want

(A(?:[aiou~NFKPQRIJ]?)(?:l[aiou~NFKPQRIJ]?)*(?:[aiou~NFKP...

If so, you'd need a much better regex parser than /./.
