
Regular expression grouping in Perl

I have a variable that contains: No such file or directory at ./EMSautoInstall.pl line 50.

I want to create one variable that contains No such file or directory and another that contains at ./EMSautoInstall.pl line 50.

My regex is: my ( $eStmnt, $lineNO ) = $! =~ /(.*[^a][^t])(.*)/;

When I print both variables, the first one contains No such file or directory but the second one is empty.

Why does this happen?

Do you really have that string in the $! variable? Normally, the at ... line ... part is added by die and warn. I suspect you simply have:

$! = "No such file or directory";

And your regex matches because it allows the empty string:

/(.*[^a][^t])(.*)/

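That is, the second capture is free to match nothing, while the greedy first capture takes as much as it can. Here that is the whole string, since [^a][^t] only requires that the capture end with a character other than a followed by a character other than t.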

To confirm,

print $!;

should print No such file or directory.
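A minimal sketch of both points, assuming a hypothetical path /no/such/file that does not exist on the machine: $! carries only the error text, die is what appends the at ... line ... suffix, and the regex from the question leaves the second capture empty for both strings because the final (.*) is allowed to match nothing. The helper name captures is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical path, assumed not to exist on this machine.
my $path = '/no/such/file';

my $plain = '';
my $died  = '';
unless ( open my $fh, '<', $path ) {
    $plain = "$!";          # error text only, no location
    eval { die $plain };    # die appends " at <file> line <n>.\n"
    $died = $@;
}

# The regex from the question, applied to one message.
sub captures {
    my ($msg) = @_;
    chomp $msg;
    my ( $first, $second ) = $msg =~ /(.*[^a][^t])(.*)/;
    return ( $first, $second );
}

for my $msg ( $plain, $died ) {
    my ( $first, $second ) = captures($msg);
    print "first:  [$first]\nsecond: [$second]\n";
}
```

In both cases the greedy first group swallows the whole message, so this regex would not isolate the location even if it were present in the string.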

Using split with a lookahead assertion makes more sense here than regex captures:

my ( $eStmnt, $lineNO ) = split /(?=at)/, $!;
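A short sketch of what this split returns for the message from the question. The second variant (whitespace before the lookahead, whitespace required after at, and a limit of 2) is an assumption added here, not part of the answer; it is meant to avoid splitting inside words that happen to contain "at". The helper name split_location and the second sample message are hypothetical.

```perl
use strict;
use warnings;

my $msg = 'No such file or directory at ./EMSautoInstall.pl line 50.';

# Split before every "at"; the first piece keeps its trailing space.
my @simple = split /(?=at)/, $msg;

# Assumed variant: consume the whitespace, require "at" to be followed by
# whitespace, and stop after two fields.
sub split_location {
    my ($text) = @_;
    return split /\s+(?=at\s)/, $text, 2;
}

my ( $eStmnt, $lineNO ) = split_location($msg);
print "[$_]\n" for @simple, $eStmnt, $lineNO;

# "Operation" contains "at", so the simple pattern yields three pieces here.
my $other  = 'Operation not permitted at ./x.pl line 3.';
my @pieces = split /(?=at)/, $other;
print scalar(@pieces), " pieces from the simple split\n";
```

With the simple pattern, a list assignment to two variables would silently keep only the first two pieces of such a message.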

You can use this:

((?:[^a]+|\Ba|a(?!t\b))+)(.*)

The idea is to match everything that is not an "a", or an "a" that is not part of the word "at".

Details:

(                 # first capturing group
    (?:           # open a non capturing group
        [^a]+     # all that is not a "a" one or more times
      |           # OR
        \Ba       # a "a" not preceded by a word boundary
      |           # OR
        a(?!t\b)  # "a" not followed by "t" and a word boundary
    )+            # repeat the non capturing group 1 or more times
)                 # close the capturing group
(.*)              # the second capturing group  
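The commented layout can be used almost as-is under the /x modifier, which ignores whitespace and comments inside the pattern. A minimal sketch, with split_at_word as a hypothetical helper name:

```perl
use strict;
use warnings;

# Same pattern, written with /x so whitespace and comments are ignored.
my $re = qr{
    (                  # capture 1: text up to the first standalone "at"
        (?:
            [^a]+      # a run of characters other than "a"
          | \Ba        # an "a" that does not start a word
          | a(?!t\b)   # an "a" not followed by "t" plus a word boundary
        )+
    )
    (.*)               # capture 2: the rest of the line
}x;

sub split_at_word {    # hypothetical helper name
    my ($msg) = @_;
    return $msg =~ $re;    # list context: returns the two captures
}

my ( $eStmnt, $lineNO )
    = split_at_word('No such file or directory at ./EMSautoInstall.pl line 50.');
print "[$eStmnt]\n[$lineNO]\n";
```

Note that the first capture keeps the space before at, since a space is simply another character that is not an "a".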

You can improve this pattern by replacing the non-capturing group with an atomic group and the quantifiers with possessive quantifiers. The goal is to stop the regex engine from recording backtracking positions; the result stays the same:

((?>[^a]++|\Ba|a(?!t\b))++)(.*+)
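One way to check the "same result" claim is to run both forms over a few inputs and compare the captures. This is only a sketch: possessive quantifiers need Perl 5.10 or later, and every message except the first is a made-up test input.

```perl
use strict;
use warnings;

my $plain      = qr/((?:[^a]+|\Ba|a(?!t\b))+)(.*)/;
my $possessive = qr/((?>[^a]++|\Ba|a(?!t\b))++)(.*+)/;

# The first message is from the question; the others are made-up inputs.
my @messages = (
    'No such file or directory at ./EMSautoInstall.pl line 50.',
    'Operation not permitted at ./x.pl line 3.',
    'Bad data format at ./y.pl line 7.',
);

my $mismatches = 0;
my @results;
for my $msg (@messages) {
    my @a = $msg =~ $plain;
    my @b = $msg =~ $possessive;
    $mismatches++ unless $a[0] eq $b[0] && $a[1] eq $b[1];
    push @results, [@b];
    print "[$b[0]] [$b[1]]\n";
}
print "mismatches: $mismatches\n";
```

Agreement on a handful of strings is not a proof of equivalence, but it is a cheap sanity check before swapping one pattern for the other.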
