
Perl regex match longest sequence

I have a string like the one below:

atom:link[@me="samiron" and @test1="t1" and @test2="t2"]

and I need a regular expression which will generate the following back references

#I would prefer to have
$1 = @test1
$2 = t1
$3 = @test2
$4 = t2

#Or at least the following. I will break these up into parts later on.
$1 = @test1="t1"
$2 = @test2="t2"

I've tried something like ( and [@\\w]+=["\\w]+)*\\] but it returns only the last match, and @test2="t2" . I'm completely out of ideas. Any help?

Edit: Actually, the number of @test1="t1" patterns is not fixed, and the regex must handle that. Thanks @Pietzcker.

You can do it like this:

use Data::Dumper;

my $text = 'atom:link[@me="samiron" and @test1="t1" and @test2="t2"]';
my @results;
# Each iteration of the /g match finds the next pair preceded by "and ".
while ($text =~ m/and (\@\w+)="(\w+)"/g) {
  push @results, $1, $2;
}
print Dumper \@results;

Result:

$VAR1 = [
          '@test1',
          't1',
          '@test2',
          't2'
        ];

This will give you a hash that maps "@test1" => "t1", and so on:

my %matches = ($str =~ /and (\@\w+)="(\w+)"/g);

Explanation: in list context, a /g global match returns a list of all captures, such as "@test1", "t1", "@test2", "t2", ...

When this list is assigned to the hash %matches, Perl treats it as key-value pairs. As a result, %matches contains what you are looking for in a convenient hash format.
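As a minimal sketch of the list-to-hash assignment described above: the names $str and %matches follow the snippet, the sample string is the one from the question, and the loop variable $attr is only illustrative.

```perl
use strict;
use warnings;

my $str = 'atom:link[@me="samiron" and @test1="t1" and @test2="t2"]';

# In list context, m//g returns the captures of every match as one
# flat list, here ('@test1', 't1', '@test2', 't2').
my %matches = ($str =~ /and (\@\w+)="(\w+)"/g);

# Hashes are unordered, so sort the keys for stable output.
for my $attr (sort keys %matches) {
    print "$attr => $matches{$attr}\n";
}
```

Keep in mind that a hash does not preserve the original attribute order and keeps only the last value for a repeated key. If either matters, keep the flat list instead.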

When you use a repeating capturing group, each new repetition overwrites the previous capture, so only the last one survives.

So you have to do a "find all" instead, with a regex like

@result = $subject =~ m/(?<= and )([@\w]+)=(["\w]+)(?= and |\])/g;

to get an array of all matches. Note that ["\w]+ also captures the double quotes around each value.
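The overwriting behaviour can be seen in a small sketch. The string is the one from the question, and the variable names (@last_only, @all) are chosen here for illustration. The second pattern is a variant of the one above that moves the quotes outside the capture group, so the values come back unquoted.

```perl
use strict;
use warnings;

my $subject = 'atom:link[@me="samiron" and @test1="t1" and @test2="t2"]';

# A repeated group keeps only its final iteration, so a single match
# yields just the last attribute/value pair.
my @last_only = $subject =~ /(?: and (\@\w+)="(\w+)")+\]/;
print "repeated group: @last_only\n";   # repeated group: @test2 t2

# With /g in list context, every match contributes its own captures.
my @all = $subject =~ /(?<= and )(\@\w+)="(\w+)"(?= and |\])/g;
print "find all: @all\n";               # find all: @test1 t1 @test2 t2
```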

This works for me:

@result = $s =~ /(@(?!me).*?)="(.*?)"/g;
foreach (@result){
    print "$_\n";
}

The output is:

@test1
t1
@test2
t2
