I want to write a pattern whose capture group ends at the first occurrence of the “boundary”. Right now it extends to the last occurrence instead.
Eg:
String text = "this should match from A to the first B and not 2nd B, got that?";
Pattern ptrn = Pattern.compile("\\b(A.*B)\\b");
Matcher mtchr = ptrn.matcher(text);
while(mtchr.find()) {
String match = mtchr.group();
System.out.println("Match = <" + match + ">");
}
prints:
"Match = <A to the first B and not 2nd B>"
and I want it to print:
"Match = <A to the first B>"
What do I need to change within the pattern?
Make your * non-greedy / reluctant by using *?:
Pattern ptrn = Pattern.compile("\\b(A.*?B)\\b");
By default, the quantifier behaves greedily and matches as many characters as possible while still satisfying the pattern, that is, up until the last B.
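To see the difference side by side, here is a minimal sketch that runs both patterns against the text from the question. The class name QuantifierDemo and the helper firstMatch are made up for illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: returns the first match of regex in input, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String text = "this should match from A to the first B and not 2nd B, got that?";

        // Greedy: .* grabs as much as it can, then backtracks to the LAST B.
        System.out.println("greedy    = <" + firstMatch("\\b(A.*B)\\b", text) + ">");
        // prints: greedy    = <A to the first B and not 2nd B>

        // Reluctant: .*? grows one character at a time and stops at the FIRST B
        // that is followed by a word boundary.
        System.out.println("reluctant = <" + firstMatch("\\b(A.*?B)\\b", text) + ">");
        // prints: reluctant = <A to the first B>
    }
}
```

The only change between the two calls is the extra ? after the *, so the rest of the code from the question can stay as it is.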
See Reluctant Quantifiers from the docs, and this tutorial.
Do not use a greedy expression for the match, i.e.:
Pattern ptrn = Pattern.compile("\\b(A.*?B)\\b");
* is a greedy quantifier that matches as many characters as possible while still satisfying the pattern, which in your example means up to the last occurrence of B. That is why you need the reluctant one, *?, which will match as few characters as possible. So your pattern needs a slight change:
Pattern ptrn = Pattern.compile("\\b(A.*?B)\\b");
See “reluctant quantifiers” in the docs, and this tutorial.
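Perhaps more explicit than making * reluctant/lazy is to say that you are looking for A, followed by a bunch of things that are not B, followed by B: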
Pattern ptrn = Pattern.compile("\\b(A[^B]*B)\\b");
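On the text from the question this gives the same result as the reluctant version, but the two are not interchangeable in general. The sketch below shows one input where they differ; the class name NegatedClassDemo, the helper firstMatch, and the second input string are assumptions made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NegatedClassDemo {
    // Hypothetical helper: returns the first match of regex in input, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String reluctant = "\\b(A.*?B)\\b";
        String negated = "\\b(A[^B]*B)\\b";

        String text = "this should match from A to the first B and not 2nd B, got that?";
        System.out.println(firstMatch(reluctant, text)); // prints: A to the first B
        System.out.println(firstMatch(negated, text));   // prints: A to the first B

        // Made-up input: the first B starts the word "Bob", so it is not
        // followed by a word boundary.
        String tricky = "A to Bob B";
        // .*? may step over that B and keep going to the next one.
        System.out.println(firstMatch(reluctant, tricky)); // prints: A to Bob B
        // [^B]* can never cross a B, so the trailing \b fails and nothing matches.
        System.out.println(firstMatch(negated, tricky));   // prints: null
    }
}
```

So the negated class is the stricter reading of "up to the first B": it refuses to match rather than skip a B, whereas the reluctant quantifier only prefers the shortest match that still satisfies the whole pattern.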