
Pattern matcher using Greedy and Reluctant

In Java regex, I have read about greedy and reluctant quantifiers. The description says:

A reluctant or "non-greedy" quantifier first matches as little as possible. So the .* matches nothing at first, leaving the entire string unmatched

In this example

source: yyxxxyxx
pattern: .*xx

the greedy quantifier * produces:

0 yyxxxyxx

With the reluctant quantifier *? (pattern .*?xx), we get the following:

0 yyxx
4 xyxx

Why is a result of yxx, yxx not possible, even though that would be the smallest possible match?

The regex engine returns the first, leftmost match it finds.

Basically, it tries to match the pattern starting from the first character. If it doesn't find a match there, the engine's transmission kicks in and it tries again from the second character, and so on.

If you use a+?b on bab, it first tries from the first b. That doesn't work, so it tries again from the second character.

But in your example it finds a match starting right at the first character. Starting from the second character isn't even considered: we found a match, so we return it.

If you apply a+?b to aab, we try at the first a and find an overall match: end of story, no reason to try anything else.
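The two a+?b cases can be checked with a small sketch like the one below. The class name LeftmostMatchDemo and the helper firstMatch are made up for illustration; only java.util.regex.Pattern and Matcher come from the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LeftmostMatchDemo {
    // Hypothetical helper: returns "<start index> <matched text>" for the
    // first match the engine reports, or null if there is no match.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.start() + " " + m.group() : null;
    }

    public static void main(String[] args) {
        // Attempt at index 0 fails (b is not a), so the engine retries at index 1.
        System.out.println(firstMatch("a+?b", "bab")); // 1 ab
        // Attempt at index 0 succeeds once a+? expands to "aa"; index 1 is never tried.
        System.out.println(firstMatch("a+?b", "aab")); // 0 aab
    }
}
```

Note that on aab the shorter candidate ab (starting at index 1) is never reported, because a match starting at index 0 already exists.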

To sum up: the regex engine goes from left to right, so laziness can only affect how far a match extends to the right, never where it starts.
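Applied to the question's input, this can be sketched as follows. The class QuantifierDemo and its helpers findAll and findFrom are hypothetical names used only for this illustration; Matcher.find(int) is the standard JDK method that resets the matcher and starts the search at the given index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: every match as "<start index> <matched text>".
    static List<String> findAll(String regex, String input) {
        List<String> results = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            results.add(m.start() + " " + m.group());
        }
        return results;
    }

    // Hypothetical helper: first match when the search is forced to begin at 'start'.
    static String findFrom(String regex, String input, int start) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find(start) ? m.start() + " " + m.group() : null;
    }

    public static void main(String[] args) {
        String source = "yyxxxyxx";
        System.out.println(findAll(".*xx", source));     // [0 yyxxxyxx]
        System.out.println(findAll(".*?xx", source));    // [0 yyxx, 4 xyxx]
        // yxx only appears if the search is made to start at index 1.
        System.out.println(findFrom(".*?xx", source, 1)); // 1 yxx
    }
}
```

The reluctant pattern shortens each match on the right (yyxx instead of yyxxxyxx), but the left edge stays at index 0 because that is the first position where a match attempt succeeds.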

