
regex (lookahead) with multiple digits

I have a String which looks like:

text = "9) text of 9\r\n10) text of 10\r\n11) text of 11\r\n12) ...\r\n123) text of 123"

I am trying to split it up as follows:

String[] result = text.split("(?=\\d+\\))");

The result I am looking for is:

- result[0] = "9) text of 9"
- result[1] = "10) text of 10"
- result[2] = "11) text of 11"
- ...

But it is not working. Which regex should I use with text.split()?

I think you were very close - did you try adding the delimiter prior to the lookahead?

String[] result = text.split("\\r\\n(?=\\d+\\))");

I tried it in the JS console (JavaScript's regex syntax is close to Java's for this pattern):

let x = "9) text of 9\r\n10) text of 10\r\n11) text of 11\r\n12) ...\r\n123) text of 123";
let result = x.split(/\r\n(?=\d+\))/);

result then holds the array you wanted.
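The same comparison can be run in Java. The following is a minimal sketch, not the asker's actual code: the class name SplitDemo and both method names are hypothetical, and the input is assumed to use real CR LF line breaks. It shows why the lookahead alone misbehaves: a bare lookahead also matches between the digits of "10)", because "0)" is itself digits followed by a parenthesis.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
public class SplitDemo {

    // Lookahead only: matches at every position where "digits + ')'" begins,
    // including the position between the '1' and '0' of "10)".
    static String[] splitLookaheadOnly(String text) {
        return text.split("(?=\\d+\\))");
    }

    // Consume the CR LF delimiter, but only when a numbered prefix follows it.
    static String[] splitOnLineBreak(String text) {
        return text.split("\\r\\n(?=\\d+\\))");
    }

    public static void main(String[] args) {
        // Assumed input: items separated by real "\r\n" line breaks.
        String text = "9) text of 9\r\n10) text of 10\r\n11) text of 11";

        System.out.println(Arrays.toString(splitOnLineBreak(text)));
        System.out.println("lookahead-only pieces: " + splitLookaheadOnly(text).length);
    }
}
```

Anchoring the lookahead to the line break is what keeps multi-digit numbers intact, since a split can then only happen where a delimiter is actually consumed.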

Update: the code below reflects the updated question.

The regex "(\\d+)\\) text of (\\d+)" will do the trick, like so:

import java.util.regex.*;

String s = "9) text of 9 10) text of 10 11) text of 11 ... 123) text of 123";
Pattern p = Pattern.compile("(\\d+)\\) text of (\\d+)");
Matcher m = p.matcher(s);
boolean matchesFound = m.find();
System.out.println("found matches: " + matchesFound);
// results() (Java 9+) does not rewind the matcher; without reset() the match
// already consumed by find() would be skipped.
m.reset();
m.results().map(MatchResult::group).forEach(System.out::println);

The output will be:

found matches: true
9) text of 9
10) text of 10
11) text of 11
123) text of 123

If you want to put the results in a list/array instead, just do the following:

import java.util.*;
import java.util.regex.*;
import java.util.stream.Collectors;

String s = "9) text of 9 10) text of 10 11) text of 11 ... 123) text of 123";
Pattern p = Pattern.compile("(\\d+)\\) text of (\\d+)");
Matcher m = p.matcher(s);
boolean matchesFound = m.find();
System.out.println("found matches: " + matchesFound);
// results() does not rewind the matcher, so reset it to keep the first match.
m.reset();
List<String> results = m.results().map(MatchResult::group).collect(Collectors.toList());
System.out.println(results);
String[] resultsAsArray = results.toArray(new String[0]);
System.out.println(Arrays.toString(resultsAsArray));

The output here will be:

found matches: true
[9) text of 9, 10) text of 10, 11) text of 11, 123) text of 123]
[9) text of 9, 10) text of 10, 11) text of 11, 123) text of 123]
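The pattern in this answer declares two capturing groups, but the snippets only read the whole match. As a rough sketch of how the groups could be used (the class GroupDemo and the method indexItems are hypothetical names, not part of the answer), the item number from group 1 can key a map of the matched items:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
public class GroupDemo {

    private static final Pattern ITEM = Pattern.compile("(\\d+)\\) text of (\\d+)");

    // Maps each item number (capturing group 1) to the full matched text (group 0).
    // LinkedHashMap preserves the order in which the items were found.
    static Map<Integer, String> indexItems(String s) {
        Map<Integer, String> byNumber = new LinkedHashMap<>();
        Matcher m = ITEM.matcher(s);
        while (m.find()) {
            byNumber.put(Integer.parseInt(m.group(1)), m.group());
        }
        return byNumber;
    }

    public static void main(String[] args) {
        String s = "9) text of 9 10) text of 10 11) text of 11 ... 123) text of 123";
        indexItems(s).forEach((number, item) -> System.out.println(number + " -> " + item));
    }
}
```

A plain while (m.find()) loop starts from the beginning of the input, so it does not depend on the matcher's state the way a find() followed by results() does.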

Try a lazy quantifier instead of a greedy one, so that each match stops at the first number after the closing parenthesis. For example:

(\d*\).*?\d+)
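To see the lazy pattern in action, here is a small sketch in Java. The class name LazyMatchDemo and the method findItems are hypothetical, and the sketch assumes that the text of each item contains no digits other than its trailing number; otherwise the lazy .*? would stop early.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
public class LazyMatchDemo {

    // The regex from the answer, escaped for a Java string literal.
    private static final Pattern ITEM = Pattern.compile("(\\d*\\).*?\\d+)");

    // Collects every match of the lazy pattern, in order of appearance.
    static List<String> findItems(String text) {
        List<String> items = new ArrayList<>();
        Matcher m = ITEM.matcher(text);
        while (m.find()) {
            items.add(m.group(1));
        }
        return items;
    }

    public static void main(String[] args) {
        String text = "9) text of 9 10) text of 10 11) text of 11";
        System.out.println(findItems(text));
    }
}
```

Because .*? expands one character at a time, each match ends at the first run of digits after the parenthesis rather than at the last one in the input, which is what a greedy .* would reach.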

This short snippet will solve your problem.

    List<String> matchlist = new ArrayList<>();
    String text = "9) text of 9 10) text of 10 11) text of 11";
    Pattern regex = Pattern.compile("\\d+\\)[a-zA-Z\\s]*\\d+");

    Matcher result = regex.matcher(text);
    while (result.find()) {
        matchlist.add(result.group());
    }
    System.out.println(matchlist);

The matchlist (an ArrayList) will contain the required output, as shown below:

[9) text of 9, 10) text of 10, 11) text of 11]


 