I have the following program:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternMatching {
    public static void main(String[] args) {
        String pattern = "a??";
        Pattern pattern1 = Pattern.compile(pattern);
        String findAgainst = "a";
        Matcher matcher = pattern1.matcher(findAgainst);
        int count = 0;
        // Print every match with its start and end index, then the total count.
        while (matcher.find()) {
            count++;
            System.out.println(matcher.group(0) + ".start=" + matcher.start() + ".end=" + matcher.end());
        }
        System.out.println(count);
    }
}
which prints the following output:
.start=0.end=0
.start=1.end=1
2
instead of
.start=0.end=0
a.start=0.end=1
.start=1.end=1
3
When I run the program with the pattern "b??", the output is
.start=0.end=0
.start=1.end=1
2
which is correct. What is the reason for the incorrect output with "a??", even though it is a reluctant quantifier?
From what I see, the issue is how the Java regex engine handles a zero-length match: it compares the end index of the match with its start index, and if they coincide, the index at which the next search starts is incremented by one.
Thus, when a?? matched the empty string before a, the regex engine found a zero-length match and moved the search index to the position after a, skipping the match you expected.
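The rule described above can be sketched by hand. This is only an illustration of the idea, not the JDK's actual implementation: the class name SearchIndexSketch and the helper scan are made up for this example, and the loop uses Matcher.find(int), which resets the matcher and searches from the given index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SearchIndexSketch {
    // Hypothetical helper: applies the "step past an empty match" rule manually.
    static List<String> scan(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        int from = 0;
        // find(int) throws if the index is past the end of the input, so guard it.
        while (from <= input.length() && m.find(from)) {
            found.add("'" + m.group() + "'[" + m.start() + "," + m.end() + ")");
            // After an empty match, start the next search one position later.
            from = (m.end() == m.start()) ? m.end() + 1 : m.end();
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(scan("a??", "a")); // [''[0,0), ''[1,1)]
    }
}
```

With a?? the first search succeeds with an empty match at index 0, so the next search begins at index 1, after a, and the character is never offered to the pattern again.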
If you use the greedy version, a?, the output will be different:
a.start=0.end=1
.start=1.end=1
2
This happens because the first a was consumed, so the regex engine index is now after a, and the engine can then match the empty string at the end of the input.
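For comparison, here is a small sketch that runs both quantifiers side by side. The class name GreedyVsReluctant and the helper findAll are made up for this example. The last two calls are an addition of mine, not part of the question: they suggest that the reluctant quantifier does consume a when the rest of the match requires it, here by anchoring with $ or by using Pattern.matches, which must match the whole input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsReluctant {
    // Hypothetical helper: collects every find() result as 'text'[start,end).
    static List<String> findAll(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add("'" + m.group() + "'[" + m.start() + "," + m.end() + ")");
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findAll("a?", "a"));          // ['a'[0,1), ''[1,1)]
        System.out.println(findAll("a??", "a"));         // [''[0,0), ''[1,1)]
        // Assumption for illustration: anchoring forces the reluctant quantifier to take a.
        System.out.println(findAll("a??$", "a"));        // ['a'[0,1), ''[1,1)]
        System.out.println(Pattern.matches("a??", "a")); // true
    }
}
```

So the reluctant quantifier is not broken: it prefers the empty match, and find() then moves on instead of retrying the same position.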