Using multiline regexps in Perl 5

I have read the documentation on multiline regexes in Perl 5, but I still cannot figure out why the following two do not work as intended:

#!/usr/bin/perl

use v5.20;

my $s = <<'ENDSTR';
aaa       : AAA
bbb       : BBB
ccc       : CCC
ENDSTR

my $m = 'bbb';

my $a = $s =~ s/.*^$m *: (.*?)$.*/$1/rsm;
my $b = $s =~ s/[.\n]*?^$m *: (.*)$[.\n]*/$1/rm;

print "a: $a\n";
print "b: $b\n";

The intended output of the program is

a: BBB
b: BBB

But these regexes produce:

a: BBB
ccc       : CCC

b: aaa       : AAA
bbb       : BBB
ccc       : CCC  

How can I correct these regexes to get the intended matches?

On perlmonks.org I was pointed to a variant that works. Written as a plain match in list context, it is:

my ($a) = $s =~ /^$m *: (.*?)$/m;
my ($b) = $s =~ /^$m *: (.*)$/m;

With the /s flag, the . metacharacter also matches newlines, so the trailing .* can run across line boundaries. Either drop /s or make the .* at the end of the regex non-greedy (.*?).
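As a minimal sketch of that advice (not code from any of the answers), the value can be pulled out with a plain match under /m alone, so that . stops at the end of the line. The helper names value_for and value_by_subst are hypothetical. The second helper keeps the question's s///r shape but writes the end-of-line anchor as (?:$). This rests on my reading of perlop that $ followed by ) is never interpolated in a pattern, whereas $. and $[ in the question's patterns appear to be taken as Perl's special variables, which would explain the output shown there.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the value on the line whose key is $key,
# or undef if no such line exists.
sub value_for {
    my ($text, $key) = @_;
    # /m lets ^ and $ match at line boundaries. Without /s, "." never
    # matches a newline, so (.*) cannot run past the end of the line.
    # \Q...\E quotes any regex metacharacters in the key.
    my ($value) = $text =~ /^\Q$key\E *: (.*)$/m;
    return $value;
}

# Hypothetical helper: same result via s///r, as in the question.
# (?:$) keeps the anchor from being read as the variable $. (assumption
# stated in the lead-in). If the key is absent, the input comes back unchanged.
sub value_by_subst {
    my ($text, $key) = @_;
    return $text =~ s/.*^\Q$key\E *: (.*?)(?:$).*/$1/rsm;
}

my $s = <<'ENDSTR';
aaa       : AAA
bbb       : BBB
ccc       : CCC
ENDSTR

print "match: ", value_for($s, 'bbb'), "\n";
print "subst: ", value_by_subst($s, 'bbb'), "\n";
```

If the key is missing, value_for returns undef while value_by_subst returns the whole input, so the plain match is the easier one to check for failure.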

I think it might be easier to split the string on both \s*:\s* and \n. You can then build a hash directly from the resulting list. Note that, unlike your regular expression, this approach breaks if one of the values contains a colon. The following code works for me:

#!/usr/bin/perl

use v5.20;

my $s = <<'ENDSTR';
aaa       : AAA
bbb       : BBB
ccc       : CCC
ENDSTR

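# Split on "optional spaces, colon, optional spaces" or on a newline.
# No capturing group, so split does not return the separators themselves.
my %hash = split /\s*:\s*|\n/, $s;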
say $hash{'bbb'};
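The colon limitation can be sidestepped by matching each line instead of splitting. This is a minimal sketch, assuming one pair per line and keys without whitespace. The helper parse_pairs and the url line are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn "key : value" lines into a key/value list.
sub parse_pairs {
    my ($text) = @_;
    # In list context, a /g match returns every captured group in order,
    # i.e. key, value, key, value, ... which is exactly what a hash wants.
    # (\S+) is the key; (.*) is the rest of the line and may contain ":".
    # [ \t]* is used instead of \s* so an empty value cannot swallow
    # the newline and the following line.
    my %pairs = $text =~ /^(\S+)[ \t]*:[ \t]*(.*)$/mg;
    return %pairs;
}

my $data = <<'ENDSTR';
aaa       : AAA
bbb       : BBB
url       : http://example.com
ENDSTR

my %pairs = parse_pairs($data);
print "$pairs{bbb}\n";
print "$pairs{url}\n";
```

Only the first colon that follows the key acts as the separator, so later colons stay in the value.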

If you're trying to parse data in that format, consider Config::General, which parses a simple configuration file format that is pretty similar to what you have and also supports comments, blocks, and other cool things.
