
Splitting String by column with regex

4 
1 1
1 2 1
0
1 1

This is a String I get as input, but I only need the columns from the second one onward, i.e.:

  • 1 (second row)
  • 2 and 1 (third row)
  • 1 (fifth row)

The String has no fixed number of lines or columns (columns are separated by a single space).

I think this should be fairly easy using:

string.split("enter regex here");

I need every column after the first. I'm still learning regex, but I just can't seem to find a good solution. I know about "\\r?\\n" and " " for splitting, but I don't know how to combine the two to get every column. Any help is much appreciated :)

Another String could look like this:

2
1
1 2
9 3 5
1 3
0 9 2 4
0

In that case, I would need 2, 3, 5, 3, 9, 2, 4.

You can use the following regex:

(?<=\d )\d+

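It matches any run of digits that is preceded by "digit + space". The lookbehind (?<=\d ) checks for that prefix without including it in the match, so values in the first column are never matched.

Instead of splitting on this, you should use a Matcher with this regex and collect each match.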

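
The Matcher approach described above could be sketched as follows. This is a minimal illustration rather than the answerer's own code; the class name LookbehindColumns and the helper columnsAfterFirst are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookbehindColumns {
    // Digits preceded by "digit + space", i.e. anything that is not in the first column.
    private static final Pattern AFTER_FIRST = Pattern.compile("(?<=\\d )\\d+");

    // Hypothetical helper: collects every value after the first column, in input order.
    static List<String> columnsAfterFirst(String input) {
        List<String> values = new ArrayList<>();
        Matcher matcher = AFTER_FIRST.matcher(input);
        while (matcher.find()) {
            values.add(matcher.group());
        }
        return values;
    }

    public static void main(String[] args) {
        // prints [1, 2, 1, 1]
        System.out.println(columnsAfterFirst("4 \n1 1\n1 2 1\n0\n1 1\n"));
        // prints [2, 3, 5, 3, 9, 2, 4]
        System.out.println(columnsAfterFirst("2\n1\n1 2\n9 3 5\n1 3\n0 9 2 4\n0"));
    }
}
```

Because the lookbehind is zero-width, matcher.group() returns only the value itself, so no trimming is needed afterwards.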

First trim the leading column, then split on whitespace:

String[] split = str.replaceAll("(?m)^\\d+\\s*", "").split("\\s");


The replacement uses the multiline flag (?m), which makes ^ match at the start of every line. Because \s matches newlines as well as spaces, \d+\s* deletes the first column together with whatever whitespace follows it. On a line with more than one column that is just the separating space, so the rest of the line and its newline are retained. On a line with only one column it is the newline, so the line is deleted entirely.

Because \s matches both spaces and newlines, the split then breaks between columns and between the remaining (first column removed) lines, yielding the desired result.

I believe this is the least code required for a solution.
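
To make the two steps visible, here is a small sketch that exposes the intermediate string before the split. The class and method names are hypothetical. It also deviates from the one-liner in two labeled ways: it splits on \s+ instead of \s, and it returns an empty array when nothing is left, since "".split(...) would otherwise return a single empty string.

```java
import java.util.Arrays;

class TrimThenSplit {
    // Step 1: delete the first column of every line (same regex as the one-liner).
    static String stripFirstColumn(String str) {
        return str.replaceAll("(?m)^\\d+\\s*", "");
    }

    // Step 2: split what is left on whitespace.
    static String[] columnsAfterFirst(String str) {
        String rest = stripFirstColumn(str);
        // Added guard (an assumption of this sketch): avoid returning [""] for empty input.
        return rest.isEmpty() ? new String[0] : rest.split("\\s+");
    }

    public static void main(String[] args) {
        String input = "4 \n1 1\n1 2 1\n0\n1 1\n";
        // The intermediate string is "1\n2 1\n1\n": single-column lines are gone.
        System.out.println(stripFirstColumn(input).replace("\n", "\\n"));
        // prints [1, 2, 1, 1]
        System.out.println(Arrays.toString(columnsAfterFirst(input)));
    }
}
```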

You can use String.lines (Java 11+) to get a stream of the lines, split each line on a space with Pattern.splitAsStream, skip the first column, flatMap the remaining columns into a single stream, and join them using a comma as the delimiter:

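import java.util.regex.Pattern;
import java.util.stream.Collectors;

String input = "4 \n"
             + "1 1\n"
             + "1 2 1\n"
             + "0\n"
             + "1 1\n";

Pattern pattern = Pattern.compile(" ");
String result   = input.lines()
                       // split each line on a space and drop its first column
                       .flatMap(line -> pattern.splitAsStream(line).skip(1))
                       .collect(Collectors.joining(", "));

System.out.println(result);

// 1, 2, 1, 1

Alternatively, chain two replaceAll calls (this only works when every value is a single digit):

String s = "4 \n"
        + "1 1\n"
        + "1 2 1\n"
        + "0\n"
        + "1 1\n";
// First call: remove the leading digit of every line and all spaces.
// Second call: insert ", " between each pair of adjacent digits.
String result = s.replaceAll("((^|\\n)\\d|[ ])", "").replaceAll("(\\d)(?=\\d)", "$1, ");
System.out.println(result);
// 1, 2, 1, 1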
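
The chained replacement above relies on every value being a single digit. A possible variation that also copes with multi-digit values is sketched below. It is not part of the original answer, and the names JoinColumns and joinAfterFirst are hypothetical.

```java
class JoinColumns {
    // Hypothetical helper: returns the values after the first column, joined by ", ".
    static String joinAfterFirst(String s) {
        return s.replaceAll("(?m)^\\d+ ?", "") // drop the first column and its separating space
                .trim()                         // drop the blank lines left at either end
                .replaceAll("\\s+", ", ");      // every remaining gap becomes one delimiter
    }

    public static void main(String[] args) {
        // prints 1, 2, 1, 1
        System.out.println(joinAfterFirst("4 \n1 1\n1 2 1\n0\n1 1\n"));
        // prints 20, 30, 12
        System.out.println(joinAfterFirst("10 20 30\n7\n5 12"));
    }
}
```

Lines that held only one column become empty after the first replacement, and the final \s+ collapses the resulting run of newlines into a single ", ".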

You could use the following regex, which first captures a number followed by a space and then captures any sequence of numbers, each followed by either a space or nothing. The second capturing group holds the rest of the line, which is the part you're interested in.

(\d+) ((\d+( |))+)

Here is an implementation:

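import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "4 \n" +
        "1 1\n" +
        "1 2 1\n" +
        "0\n" +
        "1 1";

Pattern pattern = Pattern.compile("(\\d+) ((\\d+( |))+)");
Matcher matcher = pattern.matcher(str);

// Lines with a single column never match, so they are skipped.
while (matcher.find()) {
    // group(2) holds everything after the first column of the matched line
    System.out.println(matcher.group(2));
}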

Here is a link to test the code above for both inputs:

https://www.jdoodle.com/iembed/v0/s92

Output

1
2 1
1

2
3 5
3
9 2 4

Here is also a link to test the regex:

https://regex101.com/r/z1plcG/1
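
The loop above prints one string per line of input. If individual values are needed instead, as in the question, each group(2) could be split on a space. The following is a sketch under that assumption; the class name GroupTwoValues and the helper valuesAfterFirst are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupTwoValues {
    // Same regex as above: group 1 is the first column, group 2 is the rest of the line.
    private static final Pattern LINE = Pattern.compile("(\\d+) ((\\d+( |))+)");

    // Hypothetical helper: flattens every group(2) into a single list of values.
    static List<String> valuesAfterFirst(String str) {
        List<String> values = new ArrayList<>();
        Matcher matcher = LINE.matcher(str);
        while (matcher.find()) {
            values.addAll(Arrays.asList(matcher.group(2).split(" ")));
        }
        return values;
    }

    public static void main(String[] args) {
        // prints [1, 2, 1, 1]
        System.out.println(valuesAfterFirst("4 \n1 1\n1 2 1\n0\n1 1"));
        // prints [2, 3, 5, 3, 9, 2, 4]
        System.out.println(valuesAfterFirst("2\n1\n1 2\n9 3 5\n1 3\n0 9 2 4\n0"));
    }
}
```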

