Java split string with 2 delimiters

I am trying to split up some strings using String.split().

I want to split a String on the operators "++" and "+" while keeping them as tokens. For example, "1 + 1" would be split into {1, +, 1} and "1 ++ 1" would be split into {1, ++, 1}. I have written this line to split up the text:

String[] temp = tokens.split("(?<=(\\++)|(\\+))|(?=(\\++)|(\\+))");

This works fine for "1 + 1" (output: {1, +, 1}); however, it does not work for "1 ++ 1" (output: {1, +, +, 1}). I know I could convert the result into an ArrayList, find each "+" followed by another "+", and combine them into one token, but I am curious whether it is possible to do this with split() alone.

You can split on zero or more whitespace characters that have either

  • a digit before them and an operator after them, as in 12|+ 32 (I marked such places with | )
  • or an operator before them and a digit after them, as in 12 ++|32 .

Your split can look like

split("(?<=\\d)\\s*(?=[+])|(?<=[+])\\s*(?=\\d)")

DEMO:

String[] data = {"1++1" , "1 ++1", "1+ 1"};
for (String str : data){
    for (String token : str.split("(?<=\\d)\\s*(?=[+])|(?<=[+])\\s*(?=\\d)")){
        System.out.println("token: <"+token+">");
    }
    System.out.println("--------");
}

Output (I surrounded the tokens with < and > to show that the spaces are removed as well):

token: <1>
token: <++>
token: <1>
--------
token: <1>
token: <++>
token: <1>
--------
token: <1>
token: <+>
token: <1>
--------
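As a minimal sketch, the same pattern can be wrapped in a small runnable class and applied to the two inputs from the question. The class name SplitDemo, the constant BOUNDARY and the helper tokenize are hypothetical names chosen for illustration; only the regex itself comes from the answer above.

```java
import java.util.Arrays;

public class SplitDemo {
    // Pattern from the answer above: optional whitespace sitting between
    // a digit and a '+', or between a '+' and a digit.
    static final String BOUNDARY = "(?<=\\d)\\s*(?=[+])|(?<=[+])\\s*(?=\\d)";

    // Hypothetical helper: splits an expression at digit/operator boundaries.
    static String[] tokenize(String expr) {
        return expr.split(BOUNDARY);
    }

    public static void main(String[] args) {
        for (String expr : new String[] {"1 + 1", "1 ++ 1"}) {
            System.out.println(Arrays.toString(tokenize(expr)));
        }
        // prints:
        // [1, +, 1]
        // [1, ++, 1]
    }
}
```

Because the boundary never falls between two adjacent plus signs, "++" stays in one piece. The sketch assumes the operands are digits and the only operator character is '+'.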

A greedy quantifier works for me:

import java.util.Arrays;

String[] cases = {"1+1", "1++1"};
for (String str : cases) {
    // (\+)* greedily consumes a whole run of '+' as a single delimiter
    String[] out = str.split("(\\+)*");
    System.out.println(Arrays.asList(out));
}

generates:

[1, , 1]
[1, , 1]

Note that the operators themselves are discarded and an empty string is left between the operands. If this doesn't cut it, then post more test cases.
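If the operators need to be kept, one possible variation is to apply the greedy quantifier through Matcher.find() instead of split(), collecting each match as a token. This is a different technique from the split() call above, shown only as a sketch; the names MatchTokens, TOKEN and tokens are hypothetical, and the pattern assumes that operands are runs of digits and the only operator character is '+'.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchTokens {
    // \d+ matches a run of digits; \++ greedily matches a run of plus signs,
    // so "++" is returned as one token rather than two.
    private static final Pattern TOKEN = Pattern.compile("\\d+|\\++");

    // Hypothetical helper: collects every match in order; anything else
    // (such as whitespace) is skipped.
    static List<String> tokens(String expr) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(expr);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("1+1"));    // [1, +, 1]
        System.out.println(tokens("1 ++ 1")); // [1, ++, 1]
    }
}
```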
