
Group Matching Regex fails in Java

Why does this regex pattern fail to match the groups in Java? When I run the same example in a bash shell with echo and sed, it works.

String s = "Match foo and bar and baz";
//Pattern p = Pattern.compile("Match (.*) or (.*) or (.*)"); //was a typo
Pattern p = Pattern.compile("Match (.*) and (.*) and (.*)");
Matcher m = p.matcher(s);
while (m.find()) {
    System.out.println(m.group(1));
}

I am expecting to match foo, bar, and baz.

$ echo "Match foo and bar and baz" | sed 's/Match \(.*\) and \(.*\) and \(.*\)/\1, \2, \3/'
foo, bar, baz

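The greedy .* makes this pattern fragile. It still matches this particular input after backtracking, but if the text contained another " and ", the groups would capture more than a single word. A stricter regex is: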

Pattern p = Pattern.compile("Match (\\S+) and (\\S+) and (\\S+)");

This regex uses \\S+ (the Java string literal for the regex \S+), which matches one or more non-whitespace characters.

Full code

Matcher m = p.matcher(s);
while (m.find()) {
    System.out.println(m.group(1) + ", " + m.group(2) + ", " + m.group(3));
}
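To see where the two patterns actually differ, here is a minimal, self-contained sketch that runs both regexes from this thread against the original input and against a longer one. The class name, the helper capture(), and the second input string are assumptions made for illustration; they are not part of the original question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsNonSpace {
    // Hypothetical constant names; the regex strings are the ones from the thread.
    static final String GREEDY = "Match (.*) and (.*) and (.*)";
    static final String NON_SPACE = "Match (\\S+) and (\\S+) and (\\S+)";

    /** Joins the three captures with " | ", or returns null when the pattern is not found. */
    static String capture(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        return m.group(1) + " | " + m.group(2) + " | " + m.group(3);
    }

    public static void main(String[] args) {
        String original = "Match foo and bar and baz";
        String longer = "Match foo and bar and baz and qux"; // assumed extra input

        System.out.println(capture(GREEDY, original));    // foo | bar | baz
        System.out.println(capture(NON_SPACE, original)); // foo | bar | baz
        System.out.println(capture(GREEDY, longer));      // foo and bar | baz | qux
        System.out.println(capture(NON_SPACE, longer));   // foo | bar | baz
    }
}
```

On the original input both patterns give the same groups; only the longer input shows the greedy first group absorbing "foo and bar".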

You're trying to match the whole String, so

while (m.find()) {

will only iterate once.

That single find() will capture all the groups. As such, you can print them out as

System.out.println(m.group(1) + " " + m.group(2) + " " + m.group(3));

Or use a for loop over Matcher#groupCount().
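A rough sketch of that loop, assuming the goal is to collect every capturing group from a single whole-string match. The class and method names are hypothetical; matches() is used instead of find() because the pattern is meant to cover the entire input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupLoop {
    /** Returns the capturing groups of a whole-string match, or an empty list if there is none. */
    static List<String> captures(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> groups = new ArrayList<>();
        if (m.matches()) {
            // Group 0 is the entire match; capturing groups run from 1 to groupCount() inclusive.
            for (int i = 1; i <= m.groupCount(); i++) {
                groups.add(m.group(i));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> groups =
                captures("Match (.*) and (.*) and (.*)", "Match foo and bar and baz");
        System.out.println(String.join(", ", groups)); // foo, bar, baz
    }
}
```

Note that the loop bound is inclusive, since groupCount() does not count group 0.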

Your regex is correct, but you need to print all the groups, not only the first, e.g.:

while (m.find()) {
    System.out.println(m.group(1));
    System.out.println(m.group(2));
    System.out.println(m.group(3));
}

It seems like a simple typo (or -> and):

Pattern p = Pattern.compile("Match (.*) and (.*) and (.*)");

UPDATE

To replace:

String s = "Match foo and bar and baz";
String replaced = s.replaceAll("Match (.*) and (.*) and (.*)", "$1, $2, $3");
System.out.println(replaced);


 