Given the following:
"John Smith"
"John Smith (123)"
"John Smith (123) (456)"
I'd like to capture:
"John Smith"
"John Smith", "123"
"John Smith (123)", "456"
What Java regex would allow me to do that?
I've tried (.+)\\s\\((\\d+)\\)$ and it works fine for "John Smith (123)" and "John Smith (123) (456)" but not for "John Smith". How can I change the regex to work for the first input as well?
You may make the first .+ lazy, and wrap the later part in an optional non-capturing group:
(.+?)(?:\s\((\d+)\))?$
   ^ ^^^           ^^
See the regex demo
Actually, if you are using the regex with String#matches(), the last $ is redundant, because matches() only succeeds when the pattern matches the entire string.
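To illustrate why the anchor only matters outside of matches(), here is a minimal sketch comparing matches() with Matcher#find(). The class name AnchorDemo and the two helper methods are hypothetical, introduced only for this illustration. With find() and no trailing $, the lazy group is free to stop after a single character, since the optional group may match nothing.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnchorDemo {
    // Returns group 1 if the WHOLE input matches the regex, else null.
    static String group1ViaMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    // Returns group 1 of the first match found anywhere in the input, else null.
    static String group1ViaFind(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "John Smith (123)";
        String unanchored = "(.+?)(?:\\s\\((\\d+)\\))?";
        String anchored = unanchored + "$";

        // matches() implies anchoring at both ends, so no $ is needed.
        System.out.println(group1ViaMatches(unanchored, input)); // John Smith
        // find() needs the $ to force the lazy group to keep expanding.
        System.out.println(group1ViaFind(anchored, input));      // John Smith
        // Without $, the lazy group stops as early as it can.
        System.out.println(group1ViaFind(unanchored, input));    // J
    }
}
```

So keep the $ if you switch to find(), and drop it only when matches() is doing the anchoring for you.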
Details:
(.+?) - Group 1 capturing one or more characters other than a line break, as few as possible (thus allowing the subsequent subpattern to "fall" into a group)
(?:\s\((\d+)\))? - an optional sequence of a whitespace, a (, Group 2 capturing 1+ digits, and a )
$ - end of string anchor.
A Java demo:
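import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Demo {
    public static void main(String[] args) {
        String[] lst = {"John Smith", "John Smith (123)", "John Smith (123) (456)"};
        // No trailing $ here: matches() already requires a full-string match
        Pattern p = Pattern.compile("(.+?)(?:\\s\\((\\d+)\\))?");
        for (String s : lst) {
            Matcher m = p.matcher(s);
            if (m.matches()) {
                System.out.println(m.group(1));
                // group(2) is null when the optional "(digits)" part is absent
                if (m.group(2) != null) {
                    System.out.println(m.group(2));
                }
            }
        }
    }
}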
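If you need the two parts as values rather than printed lines, the same pattern can be wrapped in a small helper. This is only a sketch: the class NameIdParser and its parse method are hypothetical names, and returning a two-element array with null for a missing number is just one possible design.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NameIdParser {
    // Same pattern as above; compiled once and reused.
    private static final Pattern P = Pattern.compile("(.+?)(?:\\s\\((\\d+)\\))?");

    // Returns {name, number}; number is null when there is no trailing "(digits)".
    // Returns null when the input does not match at all (e.g. an empty string).
    static String[] parse(String input) {
        Matcher m = P.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] inputs = {"John Smith", "John Smith (123)", "John Smith (123) (456)"};
        for (String s : inputs) {
            String[] parts = parse(s);
            System.out.println(parts[0] + " | " + parts[1]);
        }
    }
}
```

Note that only the last parenthesized number is split off, so "John Smith (123) (456)" yields "John Smith (123)" and "456", and a trailing group that is not all digits stays part of the name.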