Why does this regex pattern fail to match the groups in Java? When I run the same example in a bash shell with echo and sed, it works.
String s = "Match foo and bar and baz";
//Pattern p = Pattern.compile("Match (.*) or (.*) or (.*)"); //was a typo
Pattern p = Pattern.compile("Match (.*) and (.*) and (.*)");
Matcher m = p.matcher(s);
while (m.find()) {
System.out.println(m.group(1));
}
I am expecting to match foo, bar, and baz.
$ echo "Match foo and bar and baz" | sed 's/Match \(.*\) and \(.*\) and \(.*\)/\1, \2, \3/'
foo, bar, baz
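The greedy nature of .* can cause this: each .* grabs as much text as it can, so if the input contains more than two " and " separators, the first group swallows the extra ones. You can use this regex instead: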
Pattern p = Pattern.compile("Match (\\S+) and (\\S+) and (\\S+)");
This regex uses \\S+ (the regex \S+ written as a Java string literal), which matches one or more non-whitespace characters.
Full code
Matcher m = p.matcher(s);
while (m.find()) {
System.out.println(m.group(1) + ", " + m.group(2) + ", " + m.group(3));
}
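To see where the two patterns actually differ, here is a minimal sketch. The class name GreedyDemo, the helper groups, and the second input string with an extra " and qux" are made up for illustration and are not from the question. On the original input both patterns give the same groups; they only diverge once the input has more than two " and " separators.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyDemo {
    // Hypothetical helper: returns "g1, g2, g3" for the first match, or null if nothing matches.
    static String groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        return m.group(1) + ", " + m.group(2) + ", " + m.group(3);
    }

    public static void main(String[] args) {
        String greedy = "Match (.*) and (.*) and (.*)";
        String nonSpace = "Match (\\S+) and (\\S+) and (\\S+)";

        // The input from the question: both patterns capture the same three words.
        String s = "Match foo and bar and baz";
        System.out.println(groups(greedy, s));       // foo, bar, baz
        System.out.println(groups(nonSpace, s));     // foo, bar, baz

        // Assumed input with one extra " and ": the first greedy group absorbs "foo and bar".
        String extra = "Match foo and bar and baz and qux";
        System.out.println(groups(greedy, extra));   // foo and bar, baz, qux
        System.out.println(groups(nonSpace, extra)); // foo, bar, baz
    }
}
```

Note that \\S+ only helps when each captured value is a single word; a value containing spaces would need a different pattern.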
You're trying to match the whole String, so the while (m.find()) loop will only iterate once.
That single find() will capture all the groups. As such, you can print them out as:
System.out.println(m.group(1) + " " + m.group(2) + " " + m.group(3));
Or use a for loop from 1 up to Matcher#groupCount().
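The groupCount() loop could look roughly like this. It is a sketch only: the class name GroupLoop and the helper captureAll are invented for this example. Group 0 is the whole match and is not included in groupCount(), so the loop runs from 1 to groupCount() inclusive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupLoop {
    // Hypothetical helper: collects every capturing group of the first match.
    static List<String> captureAll(String regex, String input) {
        List<String> captured = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        if (m.find()) {
            // Capturing groups are numbered from 1; group 0 is the entire match.
            for (int i = 1; i <= m.groupCount(); i++) {
                captured.add(m.group(i));
            }
        }
        return captured;
    }

    public static void main(String[] args) {
        String s = "Match foo and bar and baz";
        System.out.println(captureAll("Match (.*) and (.*) and (.*)", s)); // [foo, bar, baz]
    }
}
```

This way the printing code does not have to change if a group is later added to or removed from the pattern.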
Your regex is correct, but you need to print all of the groups, not only the first one. For example:
while (m.find()) {
System.out.println(m.group(1));
System.out.println(m.group(2));
System.out.println(m.group(3));
}
It seems like a simple typo (or -> and):
Pattern p = Pattern.compile("Match (.*) and (.*) and (.*)");
UPDATE
To replace:
String s = "Match foo and bar and baz";
String replaced = s.replaceAll("Match (.*) and (.*) and (.*)", "$1, $2, $3");
System.out.println(replaced);