Perl regex matching too broadly

Question

I have strings taken from Linux mail logs that look something like:

May 20 12:19:28 example-03 amavis[1445]: (01445-15) Passed SPAMMY {RelayedTaggedInbound}, [10.4.3.2]:49488 [10.4.3.2] <offers-john=example.com@example.net> -> <john@example.com>, Queue-ID: C00OZs0w9DB, Message-ID: <5ZCfDBMQyiUjOVD78ZFxg5%3D%3D@example.net>, mail_id: aCUpU0wtUaR, Hits: 15.587, size: 21407, queued_as: dgzikuucQ9i, 438 ms

The element I need to extract is :

<offers-john=example.com@example.net> -> <john@example.com>

I want to keep my regex as simple and clear as possible, so I don't want to go into regex for email address formats. Not least because regexing email formats is a bug-prone process !

I have tried :

$row =~ /(<.*> -> <.*>,)/;

But, despite the presence of the comma delimiter, that syntax matches all the way to the end of Message-ID, with an output such as:

<offers-john=example.com@example.net> -> <john@example.com>, Queue-ID: C00OZs0w9DB, Message-ID: <5ZCfDBMQyiUjOVD78ZFxg5%3D%3D@example.net>,

Answer 1

You need to make it non-greedy by adding ? to your regex :

(<.*?> -> <.*?>)

Answer 2

By default, the quantifier * is greedy: it matches as much as it can. You need to make it lazy (also known as non-greedy) by adding a ? after it.
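To see the difference side by side, here is a minimal sketch that runs both patterns against the sample line from the question. The variable names $greedy and $lazy are only illustrative.

```perl
use strict;
use warnings;

# Sample log line copied from the question (single-quoted, so @ and % are literal).
my $row = 'May 20 12:19:28 example-03 amavis[1445]: (01445-15) Passed SPAMMY {RelayedTaggedInbound}, [10.4.3.2]:49488 [10.4.3.2] <offers-john=example.com@example.net> -> <john@example.com>, Queue-ID: C00OZs0w9DB, Message-ID: <5ZCfDBMQyiUjOVD78ZFxg5%3D%3D@example.net>, mail_id: aCUpU0wtUaR, Hits: 15.587, size: 21407, queued_as: dgzikuucQ9i, 438 ms';

# Greedy: the second .* runs to the end of the line, then backtracks
# only as far as the LAST ">," it can find (the one after Message-ID).
my ($greedy) = $row =~ /(<.*> -> <.*>,)/;

# Lazy: each .*? stops at the FIRST point where the rest of the pattern can match.
my ($lazy) = $row =~ /(<.*?> -> <.*?>)/;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```

The greedy capture runs through the Message-ID field, as described in the question, while the lazy capture stops at the closing > of the recipient address.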

Answer 3

That is much more robustly written without the non-greedy option, and it is clearer if insignificant whitespace is added with the help of the /x modifier. Like so:

$row =~ / ( <[^<>]*> \s* -> \s* <[^<>]*> ) /x;
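One way to see why the negated character class is more robust: a lazy .*? can still cross a > if that is the only way for the overall match to succeed, whereas [^<>]* can never leave its pair of angle brackets. The sketch below uses a hypothetical log fragment (not taken from the question) that has an extra <...> before the address pair. The names $lazy and $class are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical fragment, invented for illustration: note the extra <tag>
# that appears before the sender address.
my $row = 'Passed <tag> [10.4.3.2] <a@example.net> -> <b@example.com>, Queue-ID: X';

# Lazy version: the match starts at the first "<" and .*? keeps expanding
# past "> [10.4.3.2] <" until "> -> <" finally follows.
my ($lazy) = $row =~ /(<.*?> -> <.*?>)/;

# Character-class version: [^<>]* cannot cross an angle bracket, so the
# attempt starting at <tag> fails and the regex moves on to the real address.
my ($class) = $row =~ / ( <[^<>]*> \s* -> \s* <[^<>]*> ) /x;

print "lazy:  $lazy\n";
print "class: $class\n";
```

With this input the lazy pattern captures everything from <tag> onward, while the character-class pattern captures only the two addresses and the arrow between them.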
