Find all lines using regular expression

Question

There is a text like this (many lines)

1. sdfsdf werwe werwemax45 rwrwerwr
2. 34348878 max max44444445666 sdf
3. 4353424 23423eedf max55 dfdg dfgdf
4. max45
5. 4324234234sdfsdf maxx34534

Using regular expressions, I need to match every line and, where a line contains a word of the form max<digits> (actual digits, not the literal text <digits>), capture that word in a matching group.

So I've tried this regular expression:

^.*?\b(max\d+)\b.*?$

But it matches only the lines that contain max... and skips the others.

Then I've tried

^.*?\b(max\d+)?\b.*?$

It matches all lines, but the matching group never contains max... .

Answer 1

The issue can be "debugged" with a slightly modified pattern, ^(.*?)\b(max\d+)?\b(.*?)$ , in which the rest of the pattern is wrapped in separate capturing groups. You can see that every line is matched by the Group 3 pattern, the last .*? . This happens because the first .*? is lazy and is tried with zero characters first, then (max\d+)? matches an empty string at the start of the line (unless the line itself begins with max + digits, in which case that text is captured), and the last .*? has to expand to the end of the line to satisfy $ .
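The debugging step can be reproduced with a short Python sketch. It assumes the sample lines from the question without their list numbering, and the names TEXT, DEBUG and debug_groups are made up for this illustration.

```python
import re

# Sample lines from the question (assumption: the "1.", "2.", ... numbering
# is list markup and not part of the text itself).
TEXT = "\n".join([
    "sdfsdf werwe werwemax45 rwrwerwr",
    "34348878 max max44444445666 sdf",
    "4353424 23423eedf max55 dfdg dfgdf",
    "max45",
    "4324234234sdfsdf maxx34534",
])

# The "debug" pattern: every part of the original regex gets its own group.
DEBUG = re.compile(r"^(.*?)\b(max\d+)?\b(.*?)$", re.MULTILINE)

def debug_groups(text):
    """Return the (group 1, group 2, group 3) tuple for every matched line."""
    return [m.groups() for m in DEBUG.finditer(text)]

for groups in debug_groups(TEXT):
    print(groups)
```

On every line that does not start with max + digits, Group 2 is None and Group 3 holds the whole line. Only the line max45 fills Group 2, because there the optional group can match right at the start of the line.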

You can fix it by wrapping the first part in an optional non-capturing group, with max\d+ inside it as an obligatory capturing group:

^(?:.*?\b(max\d+)\b)?.*?$

Or even with a plain .* in place of the trailing .*?$ , since .* matches greedily up to the end of the line:

^(?:.*?\b(max\d+)\b)?.*
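A minimal Python sketch of the fixed pattern, run against the sample lines from the question (again assuming the list numbering is not part of the text). FIXED and find_max_words are hypothetical names used only here.

```python
import re

# Sample lines from the question, without the list numbering (assumption).
TEXT = "\n".join([
    "sdfsdf werwe werwemax45 rwrwerwr",
    "34348878 max max44444445666 sdf",
    "4353424 23423eedf max55 dfdg dfgdf",
    "max45",
    "4324234234sdfsdf maxx34534",
])

# Optional non-capturing prefix; Group 1 inside it is obligatory.
FIXED = re.compile(r"^(?:.*?\b(max\d+)\b)?.*", re.MULTILINE)

def find_max_words(text):
    """Return (whole line, captured max<digits> word or None) per line."""
    return [(m.group(0), m.group(1)) for m in FIXED.finditer(text)]

for line, word in find_max_words(TEXT):
    print(repr(word), "<-", line)
```

Every line is matched, and Group 1 is filled only where a standalone max<digits> word exists. Note that werwemax45 in the first line is not captured, since there is no word boundary between e and m.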


Details

^ - start of string (with m option, start of a line)
(?:.*?\b(max\d+)\b)? - an optional non-capturing group:
- .*? - any 0+ chars other than line break chars, as few as possible
- \b - a word boundary
- (max\d+) - Group 1 (obligatory, will be tried once): max and 1+ digits
- \b - a word boundary
.* - rest of the line
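The same breakdown can be kept next to the pattern itself with Python's re.VERBOSE flag, which ignores whitespace in the pattern and allows # comments. This is only a sketch; ANNOTATED is a name chosen for the example.

```python
import re

ANNOTATED = re.compile(r"""
    ^                 # start of a line (because of re.MULTILINE)
    (?:               # optional non-capturing group
        .*?           #   any chars except line breaks, as few as possible
        \b            #   word boundary
        (max\d+)      #   Group 1: max followed by one or more digits
        \b            #   word boundary
    )?                # end of the optional group
    .*                # rest of the line
""", re.VERBOSE | re.MULTILINE)

m = ANNOTATED.match("abc max7 def")
print(m.group(0), "|", m.group(1))
```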

solution1
3 ACCEPTED 2017-10-18 07:59:42
