Tokens stick when splitting a String in Java?

I'm trying to split a test String, "1 + 2 = 3 += 4 + --5", into its components without relying on spaces. I want the end result to be { 1, +, 2, =, 3, +=, 4, +, --, 5 }; however, some tokens seem to stick together. I wrote the following regex to split the String:

"(?<=(\\.)|(\\w))\\s*(?=[O])|(?<=[O])\\s*(?=(\\.)|(\\w))"

and then used the replaceAll method to replace "O" with the following, which are the operators that I want to split on:

"(\\\\+)|(\\\\=)|(\\\\+=)|(\\\\-)"

However, when I split the example String with this regex, I get the following result: { 1, +, 2, =, 3, +=, 4, +--, 5 }. Why do the minuses stick to the plus in the second-to-last token? Is there any way to fix this so that the split tokens appear as { 1, +, 2, =, 3, +=, 4, +, --, 5 }?

You could do matching instead of splitting.

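import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String a = "1 + 2 = 3 += 4 +--5";
// Match either a run of digits or a run of non-word, non-space characters.
Matcher m = Pattern.compile("\\d+|[^\\w\\s]+").matcher(a);
List<String> list = new ArrayList<>();
while (m.find()) {
    list.add(m.group());
}
System.out.println(list);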

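Output (the adjacent operators +-- are still matched as a single token, because [^\\w\\s]+ consumes the whole run of symbols):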

[1, +, 2, =, 3, +=, 4, +--, 5]
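
The regex in the question only splits between an operand and an operator, never between two operators, which is why "+" and "--" stay glued together. One way around that, staying with matching rather than splitting, is to list the operators explicitly and put the multi-character ones first in the alternation. The sketch below assumes the operator set is +=, --, +, - and = (inferred from the expected output in the question); the class and method names are made up for illustration, and any other characters in the input are simply skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OperatorTokenizer {
    // Assumed operator set (from the question's expected output): +=, --, +, -, =.
    // Multi-character operators are listed before their one-character prefixes,
    // because regex alternation tries the branches from left to right.
    private static final Pattern TOKEN = Pattern.compile("\\d+|\\+=|--|[+\\-=]");

    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        // find() skips anything the pattern does not match, such as whitespace.
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Both spellings of the input give the same tokens.
        System.out.println(tokenize("1 + 2 = 3 += 4 + --5"));
        System.out.println(tokenize("1 + 2 = 3 += 4 +--5"));
    }
}
```

Both calls should print [1, +, 2, =, 3, +=, 4, +, --, 5]. Adding another operator means adding a branch to the pattern, with longer operators placed ahead of shorter ones that share a prefix.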

Try this:

import java.util.StringTokenizer;

String input = "1 + 2 = 3 += 4 + --5";
// Use the digits as delimiters and return them as tokens too (returnDelims = true).
StringTokenizer stringTokenizer = new StringTokenizer(input, "1234567890", true);

StringBuilder builder = new StringBuilder("[");

while (stringTokenizer.hasMoreTokens()) {
  // trim() strips the spaces around each run of operators
  builder.append(stringTokenizer.nextToken().trim());
  builder.append(stringTokenizer.hasMoreTokens() ? ", " : "]");
}
System.out.printf("Using the java.util.StringTokenizer: %s%n", builder);

OUTPUT:

Using the java.util.StringTokenizer: [1, +, 2, =, 3, +=, 4, + --, 5]

Note that everything between 4 and 5 still comes out as a single token, "+ --", because only the digits act as delimiters. For the same reason, a multi-digit number would be returned one digit at a time.
