简体   繁体   中英

Java Splitting With Math Expression

I am trying to split a Math Expression.

String number = "100+500";

String[] split = new String[3];

I want to make

  • split[0] = "100"
  • split[1] = "+"
  • split[2] = "500"

I tried this but I don't know what to write for splitting.

split = number.split(????);

You want to split between digits and non-digits without consuming any input... you need look arounds:

String[] split = number.split("(?<=\\d)(?=\\D)|(?<=\\D)(?=\\d)");

What the heck is that train wreck of a regex?

It's expressing the initial sentence of this answer:

  • (?<=\\d) means the previous character is a digit
  • (?=\\D) means the next character is a non-digit
  • (?<=\\d)(?=\\D) together will match between a digit and a non-digit
  • regexA|regexB means either regexA or regexB is matched, which is used as above points, but non-digit then digit for the visa-versa logic

An important point is that look arounds are non-consuming , so the split doesn't gobble up any of the input during the split.


Here's some test code:

String number = "100+500-123/456*789";
String[] split = number.split("(?<=\\d)(?=\\D)|(?<=\\D)(?=\\d)");
System.out.println(Arrays.toString(split));

Output:

[100, +, 500, -, 123, /, 456, *, 789]

To work with numbers that may have a decimal point, use this regex:

"(?<=[\\d.])(?=[^\\d.])|(?<=[^\\d.])(?=[\\d.])"

which effectively just add . to the characters that are a "number".

Off the bat, I don't know any library routine for the split. A custom splitting routine could be like this:

/**
 * Splits the given {@link String} at the operators +, -, * and /
 * 
 * @param string
 *            the {@link String} to be split.
 * @throws NullPointerException
 *             when the given {@link String} is null.
 * @return a {@link List} containing the split string and the operators.
 */
public List<String> split(String string) throws NullPointerException {
    if (string == null)
        throw new NullPointerException("the given string is null!");
    List<String> result = new ArrayList<String>();

    // operators to split upon
    String[] operators = new String[] { "+", "-", "*", "/" };

    int index = 0;
    while (index < string.length()) {
        // find the index of the nearest operator
        int minimum = string.length();
        for (String operator : operators) {
            int i = string.indexOf(operator, index);
            if (i > -1)
                minimum = Math.min(minimum, i);
        }

        // if an operator is found, split the string
        if (minimum < string.length()) {
            result.add(string.substring(index, minimum));
            result.add("" + string.charAt(minimum));
            index = minimum + 1;
        } else {
            result.add(string.substring(index));
            break;
        }
    }

    return result;
}

Some test code:

System.out.println(split("100+10*6+3"));
System.out.println(split("100+"));

Output:

[100, +, 10, *, 6, +, 3]
[100, +]

You can also use the Pattern/Matcher classes in Java:

    String expression = "100+34";
    Pattern p = Pattern.compile("(\\d+)|(\\+)");
    Matcher m = p.matcher(expression);
    String[] elems = new String[m.groupCount() +1];
    int i=0;

    while(m.find())
    {
        elems[i++] = m.group();
    }

You can do something simple instead of insane regex; just pad + with white space:

String number = "100+500";
number = number.replace("+", " + ");

Now you can split it at the white space:

String[] split = number.split(" ");

Now your indices will be set:

split[0] = "100";
split[1] = "+";
split[2] = "500";

To check for all arithmetic symbols, you can use the following method if you wish to avoid regex:

public static String replacing(String s) {
   String[] chars = {"+", "-", "/", "="};

   for (String character : chars) {
      if (s.contains(character)) {
         s = s.replace(character, " " + character + " ");//not exactly elegant, but it works
      }
   }
   return s;
}

//in main method
number = replacing(number);
String[] split = number.split(" ");

You can split your expression string, then in result having pure tokens and categorized tokens. The mXparser library supports this as well as the calculation process. Please follow the below example:

Your very simple example "100+500":

import org.mariuszgromada.math.mxparser.*;
...
...
Expression e = new Expression("100+500");
mXparser.consolePrintTokens( e.getCopyOfInitialTokens() );

Result:

[mXparser-v.4.0.0]  --------------------
[mXparser-v.4.0.0] | Expression tokens: |
[mXparser-v.4.0.0]  ---------------------------------------------------------------------------------------------------------------
[mXparser-v.4.0.0] |    TokenIdx |       Token |        KeyW |     TokenId | TokenTypeId |  TokenLevel |  TokenValue |   LooksLike |
[mXparser-v.4.0.0]  ---------------------------------------------------------------------------------------------------------------
[mXparser-v.4.0.0] |           0 |         100 |       _num_ |           1 |           0 |           0 |       100.0 |             |
[mXparser-v.4.0.0] |           1 |           + |           + |           1 |           1 |           0 |         NaN |             |
[mXparser-v.4.0.0] |           2 |         500 |       _num_ |           1 |           0 |           0 |       500.0 |             |
[mXparser-v.4.0.0]  ---------------------------------------------------------------------------------------------------------------

More sophisticated example "2*sin(x)+(3/cos(y)-e^(sin(x)+y))+10":

import org.mariuszgromada.math.mxparser.*;
...
...
Argument x = new Argument("x");
Argument y = new Argument("y");
Expression e = new Expression("2*sin(x)+(3/cos(y)-e^(sin(x)+y))+10", x, y);
mXparser.consolePrintTokens( e.getCopyOfInitialTokens() );

Result:

[mXparser-v.4.0.0]  --------------------
[mXparser-v.4.0.0] | Expression tokens: |
[mXparser-v.4.0.0]  ---------------------------------------------------------------------------------------------------------------
[mXparser-v.4.0.0] |    TokenIdx |       Token |        KeyW |     TokenId | TokenTypeId |  TokenLevel |  TokenValue |   LooksLike |
[mXparser-v.4.0.0]  ---------------------------------------------------------------------------------------------------------------
[mXparser-v.4.0.0] |           0 |           2 |       _num_ |           1 |           0 |           0 |         2.0 |             |
[mXparser-v.4.0.0] |           1 |           * |           * |           3 |           1 |           0 |         NaN |             |
[mXparser-v.4.0.0] |           2 |         sin |         sin |           1 |           4 |           1 |         NaN |             |
[mXparser-v.4.0.0] |           3 |           ( |           ( |           1 |          20 |           2 |         NaN |             |
[mXparser-v.4.0.0] |           4 |           x |           x |           0 |         101 |           2 |         NaN |             |
[mXparser-v.4.0.0] |           5 |           ) |           ) |           2 |          20 |           2 |         NaN |             |
[mXparser-v.4.0.0] |           6 |           + |           + |           1 |           1 |           0 |         NaN |             |
[mXparser-v.4.0.0] |           7 |           ( |           ( |           1 |          20 |           1 |         NaN |             |
[mXparser-v.4.0.0] |           8 |           3 |       _num_ |           1 |           0 |           1 |         3.0 |             |
[mXparser-v.4.0.0] |           9 |           / |           / |           4 |           1 |           1 |         NaN |             |
[mXparser-v.4.0.0] |          10 |         cos |         cos |           2 |           4 |           2 |         NaN |             |
[mXparser-v.4.0.0] |          11 |           ( |           ( |           1 |          20 |           3 |         NaN |             |
[mXparser-v.4.0.0] |          12 |           y |           y |           1 |         101 |           3 |         NaN |             |
[mXparser-v.4.0.0] |          13 |           ) |           ) |           2 |          20 |           3 |         NaN |             |
[mXparser-v.4.0.0] |          14 |           - |           - |           2 |           1 |           1 |         NaN |             |
[mXparser-v.4.0.0] |          15 |           e |           e |           2 |           9 |           1 |         NaN |             |
[mXparser-v.4.0.0] |          16 |           ^ |           ^ |           5 |           1 |           1 |         NaN |             |
[mXparser-v.4.0.0] |          17 |           ( |           ( |           1 |          20 |           2 |         NaN |             |
[mXparser-v.4.0.0] |          18 |         sin |         sin |           1 |           4 |           3 |         NaN |             |
[mXparser-v.4.0.0] |          19 |           ( |           ( |           1 |          20 |           4 |         NaN |             |
[mXparser-v.4.0.0] |          20 |           x |           x |           0 |         101 |           4 |         NaN |             |
[mXparser-v.4.0.0] |          21 |           ) |           ) |           2 |          20 |           4 |         NaN |             |
[mXparser-v.4.0.0] |          22 |           + |           + |           1 |           1 |           2 |         NaN |             |
[mXparser-v.4.0.0] |          23 |           y |           y |           1 |         101 |           2 |         NaN |             |
[mXparser-v.4.0.0] |          24 |           ) |           ) |           2 |          20 |           2 |         NaN |             |
[mXparser-v.4.0.0] |          25 |           ) |           ) |           2 |          20 |           1 |         NaN |             |
[mXparser-v.4.0.0] |          26 |           + |           + |           1 |           1 |           0 |         NaN |             |
[mXparser-v.4.0.0] |          27 |          10 |       _num_ |           1 |           0 |           0 |        10.0 |             |
[mXparser-v.4.0.0]  ---------------------------------------------------------------------------------------------------------------

To understand what Token.tokenId and Token.tokenTypeId means you need to refer to the API documentation and parsertokens section. For instance in Operator class you have

  1. Operator.TYPE_ID - this corresponds to Token.tokenTypeId if Token is recognized as Operator
  2. Operator.OPERATOR_NAME_ID - this corresponds to Token.tokenId if Token is recognized as particular OPERATOR_NAME.

Please follow mXparser tutorial for better understanding.

Best regards

Since +,-,* basically all mathematically symbols are special characters so you put a "\\\\" before them inside the split function like this

String number = "100+500";
String[] numbers = number.split("\\+");
for (String n:numbers) {
  System.out.println(n);
}

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM