
Regular Expression to Validate a Math Expression

I'm trying to determine whether a given input is a valid math expression. This is the code I've come up with so far, but it only returns true if Input is a single integer (e.g. 100, 200, 5, 7).

Pattern pattern = Pattern.compile("-?\\w+|[-+*%/()]");
Matcher match = pattern.matcher(Input);

if(pattern.matcher(Input).matches())
{
    System.out.print("True");
}
else
    System.out.print("False");

Further information on what I'm trying to accomplish:

For simplicity's sake, assume integers only (so no variables or decimal places).
Operators are: +, -, *, /, %.
Parentheses only (so no brackets or braces).

Examples:

Valid:

123  
1*2(3+4)%7  
3--4+5*-7  
13(12)+11-(7*15%(11-2)/4)  
(((((-99999)))))

Not Valid

1+2)  
)5--  
3+*12  
)(++**//
(50)+12)

Also, if possible, could a simple explanation on how the Regex works be included as well? I'm quite new to the topic. I understand it conceptually but have trouble implementing it in my code.

As several comments say, what you ask for is impossible with just a regex match. In fact, matching balanced parentheses is one of the classic "problems that cannot be solved by a simple regular expression". As long as your mathematical expressions can contain arbitrarily nested parentheses, you can't validate them with a single regex.

However, it is possible to validate a smaller language, and we can then build that up into a validation routine for your language with a little bit of coding. The smaller language is just like your language but with one change: no parentheses allowed. Valid expressions in that language look like this:

INTEGER OP INTEGER OP INTEGER OP .... OP INTEGER

Another way to say that is "an INTEGER followed by zero or more OP INTEGER sequences". This can be translated into a regex, as:

Pattern simpleLang = Pattern.compile("-?\\d+([-+*%/]-?\\d+)*");

So -?\\d+ means INTEGER, and [-+*%/] means OP. Okay, now how do we use this? Well, first off let's modify it to allow arbitrary spaces between integers and operators, and make the pattern static, because we're going to wrap this validation logic up in a class:

static Pattern simpleLang = Pattern.compile("\\s*-?\\d+(\\s*[-+*%/]\\s*-?\\d+)*\\s*");

(Though note that we don't allow a space between a negative sign and the number that follows it, so 3 - - 4 isn't allowed, even though 3 - -4 is.)
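To see how the whitespace-tolerant pattern behaves, here is a minimal sketch that runs it against a few inputs. The class name SimpleLangDemo and the helper isSimple are made up for illustration; only the regex is taken from the answer.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only the regex itself comes from the answer.
final class SimpleLangDemo {
    static final Pattern SIMPLE_LANG =
            Pattern.compile("\\s*-?\\d+(\\s*[-+*%/]\\s*-?\\d+)*\\s*");

    // matches() succeeds only when the entire input fits the pattern.
    static boolean isSimple(String expr) {
        return SIMPLE_LANG.matcher(expr).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"123", "3--4+5*-7", "3 - -4", "3 - - 4", "3+*12", ""};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isSimple(s));
        }
    }
}
```

The first three samples are accepted; 3 - - 4, 3+*12 and the empty string are rejected, because every operator must be followed by an integer whose optional minus sign is directly attached to its digits.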

Now, to validate the full language, what we need to do is repeatedly find a chunk that's at the innermost parenthesized level (so, a chunk containing no parens itself but surrounded by an open-close paren pair), validate that the stuff inside the parens matches the simple language, and then replace that chunk (including the surrounding parens) with some integer, surrounded by spaces so that it's considered separate from the surrounding stuff. So the logic is something like this:

  • expr coming in is 11 - (7 * 15 % (11 - 2) / 4)
  • Innermost parenthesized chunk is 11 - 2
  • Does 11 - 2 match the simple language? Yes!
  • Replace (11 - 2) with some integer. For example, with 1.
  • expr is now 11 - (7 * 15 % 1 / 4)
  • Innermost parenthesized chunk is 7 * 15 % 1 / 4
  • Does 7 * 15 % 1 / 4 match the simple language? Yes!
  • Replace (7 * 15 % 1 / 4) with some integer. For example, with 1.
  • expr is now 11 - 1
  • No more parens, so ask: does expr match the simple language? Yes!
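The walkthrough above can be reproduced with a small tracing helper. This is only a sketch: the class ReductionTrace and the method trace are hypothetical names, and the loop is a slight restructuring of the same idea (collapse one innermost chunk at a time, recording each intermediate string).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that records every intermediate form of the expression.
final class ReductionTrace {
    static final Pattern SIMPLE_LANG =
            Pattern.compile("\\s*-?\\d+(\\s*[-+*%/]\\s*-?\\d+)*\\s*");
    static final Pattern INNER_PAREN = Pattern.compile("[(]([^()]*)[)]");

    // Returns the list of intermediate strings, or null if the input is invalid.
    static List<String> trace(String expr) {
        List<String> steps = new ArrayList<>();
        steps.add(expr);
        Matcher m = INNER_PAREN.matcher(expr);
        while (m.find()) {
            // The contents of the innermost chunk must be valid by themselves.
            if (!SIMPLE_LANG.matcher(m.group(1)).matches()) {
                return null;
            }
            // Collapse "( ... )" into the placeholder integer 1.
            expr = expr.substring(0, m.start()) + " 1 " + expr.substring(m.end());
            steps.add(expr);
            m = INNER_PAREN.matcher(expr);
        }
        // Any stray paren left over makes this final match fail.
        return SIMPLE_LANG.matcher(expr).matches() ? steps : null;
    }

    public static void main(String[] args) {
        List<String> steps = trace("11 - (7 * 15 % (11 - 2) / 4)");
        if (steps == null) {
            System.out.println("invalid");
        } else {
            for (String s : steps) {
                System.out.println("[" + s + "]");
            }
        }
    }
}
```

The printed strings contain a few extra spaces compared with the bullet list, because the placeholder is inserted with a space on each side; the \\s* parts of the pattern absorb them.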

In code this works out to:

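import java.util.regex.Matcher;
import java.util.regex.Pattern;

// A parenthesis-free expression: INTEGER (OP INTEGER)*, with optional whitespace.
static Pattern simpleLang = Pattern.compile("\\s*-?\\d+(\\s*[-+*%/]\\s*-?\\d+)*\\s*");
// An innermost parenthesized chunk: "(", then no parens at all, then ")".
static Pattern innerParen = Pattern.compile("[(]([^()]*)[)]");

public static boolean validateExpr(String expr) {
    // Keep collapsing innermost chunks until no parentheses remain.
    while (expr.contains(")") || expr.contains("(")) {
        Matcher m = innerParen.matcher(expr);
        if (m.find()) {
            // The chunk's contents (group 1) must be valid on their own.
            if (!simpleLang.matcher(m.group(1)).matches()) {
                return false;
            }
            // Replace the whole "( ... )" with a placeholder integer.
            expr = expr.substring(0, m.start()) + " 1 " + expr.substring(m.end());
        } else {
            // we have parens but not an innermost paren-free region
            // This implies mismatched parens
            return false;
        }
    }
    // Whatever is left must itself be a valid parenthesis-free expression.
    return simpleLang.matcher(expr).matches();
}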
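As a usage sketch, the routine can be dropped into a class and run against the examples from the question. The wrapper class ExprValidatorDemo and its main method are hypothetical; validateExpr and the two patterns are the ones shown above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class around the answer's validateExpr.
final class ExprValidatorDemo {
    static final Pattern simpleLang =
            Pattern.compile("\\s*-?\\d+(\\s*[-+*%/]\\s*-?\\d+)*\\s*");
    static final Pattern innerParen = Pattern.compile("[(]([^()]*)[)]");

    static boolean validateExpr(String expr) {
        while (expr.contains(")") || expr.contains("(")) {
            Matcher m = innerParen.matcher(expr);
            if (!m.find()) {
                return false; // parens present, but no innermost pair: mismatched
            }
            if (!simpleLang.matcher(m.group(1)).matches()) {
                return false; // contents of the innermost pair are not valid
            }
            expr = expr.substring(0, m.start()) + " 1 " + expr.substring(m.end());
        }
        return simpleLang.matcher(expr).matches();
    }

    public static void main(String[] args) {
        String[] inputs = {
            "123", "3--4+5*-7", "(((((-99999)))))",            // listed as valid
            "1+2)", ")5--", "3+*12", ")(++**//", "(50)+12)",   // listed as not valid
            "13(12)+11-(7*15%(11-2)/4)"                        // implicit multiplication
        };
        for (String s : inputs) {
            System.out.println(s + " -> " + validateExpr(s));
        }
    }
}
```

With the pattern as written, the first three inputs come back true and the rest come back false, including the last one, which relies on implicit multiplication.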

Note that two of the expressions you called "valid" will not be called valid by this code: 1*2(3+4)%7 and 13(12)+11-(7*15%(11-2)/4). Both are rejected because they rely on implicit multiplication: there is no operator between 2 and (3+4), or between 13 and (12). If you wish to allow that sort of implicit multiplication, the easiest way to do it is to add ' ' (the space character) as an allowed operator in the simple language. This works because each parenthesized chunk is replaced by " 1 ", so a space is what ends up between the two operands. So change simpleLang to:

static Pattern simpleLang = Pattern.compile("\\s*-?\\d+(\\s*[-+ *%/]\\s*-?\\d+)*\\s*");
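Here is a minimal sketch of the validator with the space-as-operator pattern plugged in, to check what the change does in practice. The class name ImplicitMultDemo is hypothetical; the loop is the same as in validateExpr above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo of the variant that treats a space as an operator.
final class ImplicitMultDemo {
    // Same as simpleLang, but the operator class also contains ' '.
    static final Pattern simpleLang =
            Pattern.compile("\\s*-?\\d+(\\s*[-+ *%/]\\s*-?\\d+)*\\s*");
    static final Pattern innerParen = Pattern.compile("[(]([^()]*)[)]");

    static boolean validateExpr(String expr) {
        while (expr.contains(")") || expr.contains("(")) {
            Matcher m = innerParen.matcher(expr);
            if (!m.find() || !simpleLang.matcher(m.group(1)).matches()) {
                return false;
            }
            expr = expr.substring(0, m.start()) + " 1 " + expr.substring(m.end());
        }
        return simpleLang.matcher(expr).matches();
    }

    public static void main(String[] args) {
        String[] inputs = {
            "13(12)+11-(7*15%(11-2)/4)", "1*2(3+4)%7", // now accepted
            "12 34",                                   // side effect: also accepted
            "(50)+12)", "3+*12"                        // still rejected
        };
        for (String s : inputs) {
            System.out.println(s + " -> " + validateExpr(s));
        }
    }
}
```

One side effect worth knowing about: since a bare space now counts as an operator, two integers separated only by whitespace, such as 12 34, are accepted as well. If that is not wanted, a different placeholder scheme would be needed.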
