Why does my Perl regex cause an infinite loop?
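I have some code that grabs the "between" of some text; specifically, the part between a foo $someword and the next foo $someword.
However, what happens is that it gets stuck on the first "between", and somehow the internal string position never advances.
The input data is a text file with newlines here and there: they are rather irrelevant, but they make the output easier to print.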
my $component = qr'foo (\w+?)\s*?{';
while($text =~ /$component/sg)
{
push @baz, $1; #grab the $someword
}
my $list = join( "|", @baz);
my $re = qr/$list/; #create a list of $somewords
#Try to grab everything between the foo $somewords;
# or if there's no $foo someword, grab what's left.
while($text=~/($re)(.+?)foo ($re|\z|\Z)/ms)
#if I take out s, it doesn't repeat, but nothing gets grabbed.
{
# print pos($text), "\n"; #this is undef...that's a clue I'm certain.
print $1, ":", $2; #prints the someword and what was grabbed.
print "\n", '-' x 20, "\n";
}
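A minimal sketch of the likely cause, assuming the issue is the missing /g on the second loop: a scalar-context match without /g always starts at the beginning of the string and never sets pos(), so the same first match is found on every iteration. The sample string and the manual guard counter below are made up for illustration and are not part of the question's data.

```perl
use strict;
use warnings;

# Hypothetical sample text, not the asker's input file.
my $text = "foo a { 1 foo b { 2 foo c { 3";

# Without /g the match restarts at offset 0 every time, so the loop
# would spin forever on the first hit; a guard stops it by hand.
my @repeat;
my $guard = 0;
while ($text =~ /foo (\w+)/) {
    push @repeat, $1;
    last if ++$guard >= 3;
}
print "without /g: @repeat, pos is ",
    (defined pos($text) ? pos($text) : 'undef'), "\n";

# With /g each successful match resumes where the previous one ended,
# and the loop ends when no further match is found.
my @words;
while ($text =~ /foo (\w+)/g) {
    push @words, $1;
}
print "with /g: @words\n";
```

If this is indeed the problem, adding g to the modifiers of the second while loop should let it advance through the text.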
Update: here is one more update to handle 'foo' appearing within the text to be extracted:
use strict;
use warnings;
use File::Slurp;
my $text = read_file \*DATA;
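my $marker = 'foo';
# Escape the literal brace: recent Perl versions warn about or reject a bare "{" here.
my $marker_re = qr/$marker\s+\w+\s*?\{/;
while ( $text =~ /$marker_re(.+?)($marker_re|\Z)/gs ) {
    print "---\n$1\n";
    # Back up over the closing marker so the next match can start on it.
    pos $text -= length $2;
}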
__DATA__
foo one {
one1
one2
one3
foo two
{ two1 two2
two3 two4 }
that was the second one
foo three { 3
foo 3 foo 3
foo 3
foo foo
foo four{}
Output:
---

one1
one2
one3

---
 two1 two2
two3 two4 }
that was the second one

---
 3
foo 3 foo 3
foo 3
foo foo

---
}
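As a possible alternative to rewinding pos by hand, the closing marker can be placed in a lookahead so that it is never consumed. This is a sketch rather than part of the original answer; the shortened sample text and the @chunks array are assumptions made for illustration.

```perl
use strict;
use warnings;

# Shortened, hypothetical sample in the same shape as the data above.
my $text = <<'END';
foo one {
one1
foo two
{ two1 }
foo three { 3
foo 3
foo four{}
END

my $marker_re = qr/foo\s+\w+\s*?\{/;

# (?=...) checks that the next marker (or the end of the string) follows
# without consuming it, so the next /g match starts on that marker.
my @chunks;
while ($text =~ /$marker_re(.+?)(?=$marker_re|\Z)/gs) {
    push @chunks, $1;
}

print "---\n$_\n" for @chunks;
```

Because the lookahead is zero-width, no adjustment of pos is needed, and a bare 'foo' inside a chunk is still left alone as long as it is not followed by a word and an opening brace.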