exactly n times - group

I want to extract the numbers from a line using a pattern, but it won't group the numbers the way I would like.

public static void main(String[] args) {
    Pattern pattern = Pattern.compile("(.*?)((\\d+),{0,1}\\s*){7}");
    Scanner in = new Scanner("text: 1, 2, 3, 4, 5, 6, 7"); // new Scanner(new File("data.txt"));
    in.useDelimiter("\n");

    try {
        while(!(in.hasNext(pattern))) {
            //Skip corrupted data
            in.nextLine();
        }
    } catch(NoSuchElementException ex) {
    }
    String line = in.next();
    Matcher m = pattern.matcher(line);
    m.matches();
    int groupCount = m.groupCount();
    for(int i = 1; i <= groupCount; i++) {
        System.out.println("group(" + i + ") = " + m.group(i));
    }
}

Output:

group(1) = text:
group(2) = 7
group(3) = 7

What I want to get is:

group(2) = 1
group(3) = 2
...
group(8) = 7

Can I get this from this one pattern, or should I write another one?

If you simply want to collect the integers, you can iterate over the input with the Matcher.find() method, using a pattern of the following shape: 1) an optional prefix, either a leading label at the start of the input or a comma separator; 2) an integer, possibly surrounded by whitespace. You do not have to manage group indexes at all, because every match exposes its number through the same single capture group. The following solution needs nothing but regular expressions and simply iterates over a character sequence to find the integers:

package stackoverflow;

import java.util.ArrayList;
import java.util.Collection;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import static java.lang.System.out;
import static java.util.regex.Pattern.compile;

public final class Q11599271 {

    private Q11599271() {
    }

    // Each match consists of:
    //   (?:^\S+:|,)?   an optional prefix: a "label:" at the very start of
    //                  the input, or a comma separator
    //   \s*(\d+)\s*    an integer (capture group 1) with optional
    //                  surrounding whitespace
    private static final Pattern pattern = compile("(?:^\\S+:|,)?\\s*(\\d+)\\s*");

    private static Iterable<String> getOut(CharSequence s) {
        final Collection<String> numbers = new ArrayList<String>();
        final Matcher matcher = pattern.matcher(s);
        while ( matcher.find() ) {
            numbers.add(matcher.group(1));
        }
        return numbers;
    }

    private static void display(Iterable<String> strings) {
        for ( final String s : strings ) {
            out.print(" ");
            out.print(s);
        }
        out.println();
    }

    public static void main(String[] args) {
        display(getOut("text: 1, 2, 3, 4, 5, 6, 7"));
        display(getOut("1, 2, 3, 4, 5, 6, 7"));
        display(getOut("text: 1,  22,  333   , 4444 , 55555 , 666666, 7777777"));
    }

}

That will produce the following:

1 2 3 4 5 6 7
1 2 3 4 5 6 7
1 22 333 4444 55555 666666 7777777
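
The find() loop above collects however many integers happen to be on the line. If the count matters, as the "exactly n times" in the title suggests, one option is to validate the shape of the whole line first and only then collect the numbers. The following is a minimal sketch under assumptions that are not in the original posts: the class name ExactCountDemo and the method parseExactly are made up for illustration, the separator is assumed to be a mandatory comma, count is assumed to be at least 1, and every number is assumed to fit into an int.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class ExactCountDemo {

    private ExactCountDemo() {
    }

    // Hypothetical helper: returns the integers only if the line holds an
    // optional "label:" prefix followed by exactly `count` comma-separated
    // integers; otherwise returns an empty list.
    public static List<Integer> parseExactly(String line, int count) {
        // Group 1 covers the whole number list, so a digit inside the label
        // can never be picked up by the find() loop below.
        Pattern shape = Pattern.compile(
                "(?:\\S+:)?\\s*(\\d+(?:\\s*,\\s*\\d+){" + (count - 1) + "})\\s*");
        List<Integer> numbers = new ArrayList<>();
        Matcher whole = shape.matcher(line);
        if (!whole.matches()) {
            return numbers; // wrong number of integers or malformed line
        }
        Matcher digits = Pattern.compile("\\d+").matcher(whole.group(1));
        while (digits.find()) {
            numbers.add(Integer.parseInt(digits.group()));
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(parseExactly("text: 1, 2, 3, 4, 5, 6, 7", 7));
        System.out.println(parseExactly("text: 1, 2, 3", 7));
    }
}
```

The quantifier {n} is still used here, but only to check the line; the numbers themselves come from the second, simpler pattern, so no group bookkeeping is needed.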

You cannot. Groups always correspond to the capturing groups written in the regular expression. That is, if the expression has one capturing group, the match cannot contain more than one group. It is irrelevant how often a portion (even a capturing group) is repeated during the match; a repeated group only retains the text of its last iteration, which is why group(2) and group(3) both print 7 in the question. The expression itself defines how many groups the final match can have.
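
To make this concrete, here is a small sketch that compares the pattern from the question with a pattern that spells out one capturing group per number. The class RepeatedGroupDemo and its helpers explicitGroups and groupsOf are hypothetical names introduced for this illustration, and the generated pattern assumes a mandatory comma between numbers, which is stricter than the optional comma in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class RepeatedGroupDemo {

    private RepeatedGroupDemo() {
    }

    // Hypothetical helper: builds a pattern with `count` explicit capturing
    // groups, e.g. .*?(\d+)\s*,\s*(\d+) for count = 2.
    public static Pattern explicitGroups(int count) {
        StringBuilder regex = new StringBuilder(".*?");
        for (int i = 0; i < count; i++) {
            if (i > 0) {
                regex.append("\\s*,\\s*"); // assumed: comma is mandatory
            }
            regex.append("(\\d+)");
        }
        return Pattern.compile(regex.toString());
    }

    // Returns group(1)..group(groupCount()) if the whole line matches,
    // or an empty list if it does not.
    public static List<String> groupsOf(Pattern pattern, String line) {
        List<String> groups = new ArrayList<>();
        Matcher m = pattern.matcher(line);
        if (m.matches()) {
            for (int i = 1; i <= m.groupCount(); i++) {
                groups.add(m.group(i));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        String line = "text: 1, 2, 3, 4, 5, 6, 7";
        // Pattern from the question: three groups, no matter how often
        // the inner groups repeat.
        Pattern repeated = Pattern.compile("(.*?)((\\d+),{0,1}\\s*){7}");
        System.out.println(groupsOf(repeated, line));          // [text: , 7, 7]
        // Seven groups written out explicitly: one group per number.
        System.out.println(groupsOf(explicitGroups(7), line)); // [1, 2, 3, 4, 5, 6, 7]
    }
}
```

Generating the pattern keeps it readable, but the group count is still fixed when the pattern is compiled; for a variable number of integers, iterating with Matcher.find() is the simpler route.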
