Quotemeta in Perl? How to solve error: "Nested quantifiers in regex; marked by <-- HERE"
The following is the code:
my $vowels = "[aiou~NFKPQRIJ]";
my @diactok;
for $rx (@tokens) {
    $rx =~ s/.\K/$vowels?/g;
    if ($diac =~ /($rx)/) {
        push @diactok, $diac =~ /$rx/g;
    }
}
From this previous question: How do I tokenise a word given tokens that are subsumed incompletely in the word?
It's fine except for this error (I did "use diagnostics"):
Nested quantifiers in regex; marked by <-- HERE in m/(A[aiou~NFKPQRIJ]?l[aiou~NFKPQRIJ]?* <-- HERE [aiou~NFKPQRIJ]?y[aiou~NFKPQRIJ]?n[aiou~NFKPQRIJ]?)/ at tokenizeForCRFinput.pl line 47, line 288670 (#3)
(F) You can't quantify a quantifier without intervening parentheses. So things like ** or +* or ?* are illegal. The <-- HERE shows in the regular expression about where the problem was discovered.
Note that the minimal matching quantifiers, *?, +?, and ?? appear to be nested quantifiers, but aren't. See perlre.
Uncaught exception from user code: Nested quantifiers in regex; marked by <-- HERE in m/(A[aiou~NFKPQRIJ]?l[aiou~NFKPQRIJ]?* <-- HERE [aiou~NFKPQRIJ]?y[aiou~NFKPQRIJ]?n[aiou~NFKPQRIJ]?)/ at tokenizeForCRFinput.pl line 47, line 288670. at tokenizeForCRFinput.pl line 47
Line 47 is this one:
if ($diac =~ /($rx)/)
I tried quotemeta but that didn't work. Maybe I'm using it wrong? Some of the strings captured in $diac do indeed have special characters like '?' and '*'.
The line:
$rx =~ s/.\K/$vowels?/g;
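is the culprit, if you indeed have meta characters in @tokens. Each character of the token is copied into the pattern unescaped, so a '*' in a token ends up directly after the '?' that was appended to the previous character. Try this: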
$rx =~ s/(.)/ quotemeta($1) . "$vowels?" /eg;
Note that you cannot quotemeta the whole regex, since $vowels contains meta characters that are needed.
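For anyone who wants to try this, here is a minimal sketch that first reproduces the error with the original substitution and then applies the quotemeta version. The token 'Al*yn' is taken from the error message; the word in $diac is a made-up sample, and copying the token into $naive and $rx (instead of editing @tokens in place) is my own choice, not something from the question.

```perl
use strict;
use warnings;

my $vowels = "[aiou~NFKPQRIJ]";
my $token  = 'Al*yn';       # token containing the metacharacter '*'
my $diac   = 'Aal*iyna';    # hypothetical diacritised word, for illustration only

# Original substitution: the '*' from the token lands right after a '?'.
(my $naive = $token) =~ s/.\K/$vowels?/g;
my $compiled = eval { qr/$naive/ };    # compile at run time so the failure can be caught
my $error    = $@;
print "naive pattern failed: $error" unless defined $compiled;

# Suggested substitution: escape each character, then append the optional vowel class.
(my $rx = $token) =~ s/(.)/ quotemeta($1) . "$vowels?" /eg;
my @diactok = $diac =~ /$rx/g;
print "matched: @diactok\n";
```

With this version the '*' in the token is treated as a literal asterisk, which is only what you want if the tokens are plain strings rather than patterns.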
The pattern is originally
(Al*yn)
You change it to
(A[aiou~NFKPQRIJ]?l[aiou~NFKPQRIJ]?*[aiou~NFKP...
Like the message says, [aiou~NFKPQRIJ]?* is wrong. You didn't specify what you want, so it's hard to give you a fix.
Maybe you want
(A(?:[aiou~NFKPQRIJ]?)l(?:[aiou~NFKPQRIJ]?)*(?:[aiou~NFKP...
If so, just use
$rx =~ s/.\K/(?:$vowels?)/g;
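A quick sketch of what that substitution does, assuming the '*' in the token is meant as a quantifier rather than a literal character. The two sample words are hypothetical and only there to show the difference.

```perl
use strict;
use warnings;

my $vowels = "[aiou~NFKPQRIJ]";
my $token  = 'Al*yn';    # '*' is treated as a quantifier in this variant

# Wrap each optional vowel class in a non-capturing group so that a
# following '*' quantifies the group instead of the bare '?'.
(my $rx = $token) =~ s/.\K/(?:$vowels?)/g;

# Hypothetical sample words, for illustration only.
my %matched;
for my $word ('Aaliyna', 'Aal*iyna') {
    $matched{$word} = $word =~ /$rx/ ? 1 : 0;
    print "$word: ", ($matched{$word} ? "matches" : "does not match"), "\n";
}
```

The pattern now compiles, but the '*' repeats the vowel group after 'l', so a word containing a literal asterisk is not matched.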
Maybe you want
(A(?:[aiou~NFKPQRIJ]?)(?:l[aiou~NFKPQRIJ]?)*(?:[aiou~NFKP...
If so, you'd need a much better regex parser than /./.
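As a rough illustration of what "better" could mean, here is a sketch that only works under a strong assumption: every token is a run of literal characters, each optionally followed by a single *, + or ? quantifier. That restriction is mine, not something stated in the question, and anything beyond it (character classes, groups, escapes) would need a real parser. The result is also a simplified form of the pattern above, without the extra vowel group after the quantifier.

```perl
use strict;
use warnings;

my $vowels = "[aiou~NFKPQRIJ]";
my $token  = 'Al*yn';    # '*' is meant to repeat the preceding 'l' here

# Assumption: each literal character may be followed by one of * + ?.
# The character plus its optional vowel becomes one unit, and the
# quantifier, if present, is applied to the whole unit.
(my $rx = $token) =~ s{(.)([*+?]?)}{
    my $unit = quotemeta($1) . "$vowels?";
    length($2) ? "(?:$unit)$2" : $unit;
}eg;

print "$rx\n";
```

For 'Al*yn' this builds a pattern in which 'l' together with its optional vowel can repeat, so a hypothetical word like 'Aalilayna' would be matched as a whole.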