
Getting match of Group with Asterisk?

How can I get the content for a group with an asterisk?

For example, I'd like to parse a comma-separated list, e.g. 1,2,3,4,5.

private static final String LIST_REGEX = "^(\\d+)(,\\d+)*$";
private static final Pattern LIST_PATTERN = Pattern.compile(LIST_REGEX);

public static void main(String[] args) {
    final String list = "1,2,3,4,5";
    final Matcher matcher = LIST_PATTERN.matcher(list);
    System.out.println(matcher.matches());
    for (int i = 0, n = matcher.groupCount(); i < n; i++) {
        System.out.println(i + "\t" + matcher.group(i));
    }
}

And the output is

true
0   1,2,3,4,5
1   1

How can I get every single entry, i.e. 1, 2, 3, ...?

I am searching for a general solution. This is only a demonstrative example.
Please imagine a more complicated regex like ^\\[(\\d+)(,\\d+)*\\]$ to match a list like [1,2,3,4,5].

You can use String.split().

for (String segment : "1,2,3,4,5".split(","))
    System.out.println(segment);

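Or you can capture repeatedly by calling find() in a loop:

Pattern pattern = Pattern.compile("(\\d+),?");
// find() is the loop condition: each call advances to the next entry
for (Matcher m = pattern.matcher("1,2,3,4,5"); m.find();)
    System.out.println(m.group(1));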

For the second example you added, you can do a similar match: strip everything around the list first, then split.

for (String segment : "!!!!![1,2,3,4,5] //"
                          .replaceFirst("^\\D*(\\d+(?:,\\d+)*)\\D*$", "$1")
                          .split(","))
    System.out.println(segment);
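
One caveat with the replaceFirst approach: when the input does not match, replaceFirst returns it unchanged, so split(",") then runs on the raw string. A minimal sketch of a stricter variant follows, which validates the whole string first and only then pulls out the entries with a second pattern. The names BracketListParser and parse are made up for this illustration, and it assumes every entry fits in an int.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BracketListParser {
    // Validates the whole input; group 1 is the comma-separated run inside the brackets.
    private static final Pattern LIST = Pattern.compile("^\\[(\\d+(?:,\\d+)*)\\]$");
    // Matches one entry at a time inside the validated run.
    private static final Pattern ITEM = Pattern.compile("\\d+");

    // Hypothetical helper: returns the entries, or throws if the input is not a bracketed list.
    static List<Integer> parse(String input) {
        Matcher whole = LIST.matcher(input);
        if (!whole.matches()) {
            throw new IllegalArgumentException("Not a bracketed list: " + input);
        }
        List<Integer> values = new ArrayList<>();
        Matcher item = ITEM.matcher(whole.group(1));
        while (item.find()) {
            // Assumption: entries are small enough for Integer.parseInt
            values.add(Integer.parseInt(item.group()));
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parse("[1,2,3,4,5]")); // prints [1, 2, 3, 4, 5]
    }
}
```

Splitting the work into one validating pattern and one extracting pattern keeps each regex simple and works for any repeated group, not just digits.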

I made an online code demo. I hope this is what you wanted.


How can I get all the matches (zero, one or more) for an arbitrary group with an asterisk, (xyz)*? [The group is repeated and I would like to get every repeated capture.]

No, you cannot. Regex Capture Groups and Back-References explains why:

The Returned Value for a Given Group is the Last One Captured

Since a capture group with a quantifier holds on to its number, what value does the engine return when you inspect the group? All engines return the last value captured. For instance, if you match the string A_B_C_D_ with ([A-Z]_)+, when you inspect the match, Group 1 will be D_. With the exception of the .NET engine, all intermediate values are lost. In essence, Group 1 gets overwritten each time its pattern is matched.
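
The behaviour described in the quote is easy to reproduce in Java. The sketch below is only an illustration (the class and method names are invented here): lastCapture shows that a quantified group keeps just its final repetition, while allCaptures recovers every repetition by running the inner pattern on its own with find().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LastCaptureDemo {
    // Group 1 is quantified, so after a successful match it holds only the last repetition.
    static String lastCapture(String input) {
        Matcher m = Pattern.compile("([A-Z]_)+").matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    // Workaround: match the inner pattern by itself and collect each occurrence.
    static List<String> allCaptures(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile("[A-Z]_").matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(lastCapture("A_B_C_D_")); // prints D_
        System.out.println(allCaptures("A_B_C_D_")); // prints [A_, B_, C_, D_]
    }
}
```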

I assume you are looking for something like the following, which handles both of your examples: capture the whole comma-separated run in one group, then split it.

private static final String LIST_REGEX = "^\\[?(\\d+(?:,\\d+)*)\\]?$";
private static final Pattern LIST_PATTERN = Pattern.compile(LIST_REGEX);

public static void main(String[] args) {
    final String list = "[1,2,3,4,5]";
    final Matcher matcher = LIST_PATTERN.matcher(list);

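    // matches() must succeed before group(1) can be read
    final boolean matched = matcher.matches();
    System.out.println(matched);
    if (!matched) {
        return;
    }

    // Group 1 holds the whole comma-separated run; split it into entries
    int i = 0;
    System.out.println(i + "\t" + matcher.group(1));

    for (String x : matcher.group(1).split(",")) {
        i++;
        System.out.println(i + "\t" + x);
    }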
}

Output

true
0   1,2,3,4,5
1   1
2   2
3   3
4   4
5   5
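
If a single pattern is preferred over match-then-split, Java's \G anchor (the end of the previous match) is another option. This is a hedged sketch rather than a drop-in replacement: ContiguousCaptures and entries are hypothetical names, and the pattern does not verify the closing bracket, so it suits extraction more than validation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ContiguousCaptures {
    // The first entry must follow "[" at the start of the input; each later entry
    // must follow a comma located exactly where the previous match ended (\G).
    private static final Pattern ENTRY = Pattern.compile("(?:^\\[|\\G,)(\\d+)");

    // Hypothetical helper: collects the leading run of contiguous entries.
    static List<String> entries(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = ENTRY.matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(entries("[1,2,3,4,5]")); // prints [1, 2, 3, 4, 5]
    }
}
```

Because every match after the first has to start where the previous one ended, the loop stops at the first character that breaks the list, so entries("[1,2;3]") yields only 1 and 2.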

