
Regular expression to validate a mathematical formula

I need to validate a string using a regex to confirm that it follows a valid format. The string can contain numbers, operators, spaces, dots, left and right parentheses, commas, the aggregate functions SUM, MAX, MIN and AVG, and variables starting with the letter V.

I found this regex, ^[0-9+ -/()., ]+$, which checks for 0-9 (numbers); '+'; '-'; '/'; '('; ')'; '.'; ','; and ' ' (space). But I am not able to include the aggregate functions and the letter V in it.

Some of the valid input strings are

  1. AVG(SUM(1, 2, 3), SUM(4, 5, 6)) * 100
  2. SUM(V1/2,(2+7),3)+(V1+V2)

Can someone please help me with this?

From the comments on the question:

Are you trying to ensure that only valid characters, aggregate functions, and variable names appear in the string, or are you also attempting to check that the string is well formed (i.e. there is an operand on either side of an operator, parentheses are matched, etc.)?

- DM

@DM I am just trying to validate for valid characters only

- DevMJ


Since you're only looking to check that a formula contains digits, functions, variables, etc. (and not that it is also valid for execution), you can list the possibilities as alternatives in one group.

One possibility is the pattern ^(?:\d|\+|\-|\/|\*|\(|\)|\.|\,|AVG|SUM|MAX|MIN|V\d+| )*$ which matches the samples you provided.


Explanation:

Token    Matches
^        Start of a line
(?:      Start of the non-capturing group of alternatives
\d       A digit (equivalent to [0-9])
\+       The + character
\-       The - character
\/       The / character
\*       The * character
\(       The ( character
\)       The ) character
\.       The . character
\,       The , character
AVG      The string AVG
SUM      The string SUM
MAX      The string MAX
MIN      The string MIN
V\d+     The V character followed by one or more digits
(space)  A space
)        End of the non-capturing group of alternatives
*        Any of the alternatives zero or more times
$        End of a line
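
As a quick check, here is a minimal Python sketch that runs the pattern against the two sample strings and a few extra inputs. The function name has_only_valid_tokens and the extra inputs are made up for illustration and are not part of the question.

```python
import re

# The pattern from this answer: the whole string must be a sequence of allowed tokens.
TOKEN_PATTERN = re.compile(
    r"^(?:\d|\+|\-|\/|\*|\(|\)|\.|\,|AVG|SUM|MAX|MIN|V\d+| )*$"
)


def has_only_valid_tokens(formula: str) -> bool:
    """Return True if the formula is made up solely of allowed tokens."""
    # fullmatch also rejects a trailing newline, which a bare `$` would tolerate.
    return TOKEN_PATTERN.fullmatch(formula) is not None


checks = [
    "AVG(SUM(1, 2, 3), SUM(4, 5, 6)) * 100",  # sample 1 -> True
    "SUM(V1/2,(2+7),3)+(V1+V2)",              # sample 2 -> True
    "COUNT(1, 2)",                            # unknown function -> False
    "V + 1",                                  # V without digits -> False
    "1 + + )",                                # valid tokens, nonsense formula -> True
]

for formula in checks:
    print(f"{formula!r}: {has_only_valid_tokens(formula)}")
```

The last input passes because the pattern only checks which tokens appear, not how they are arranged.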

As mentioned in the comments, if you also want to check that the string can be executed successfully, you will need to look into defining a context-free grammar for your "language" and using a tool like ANTLR to parse strings using the grammar.
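
To give a rough idea of what such a structural check involves, here is a small hand-written recursive-descent checker in Python. It is not ANTLR, and the grammar is my own guess at what the formulas should look like (an assumption, not something specified in the question): expr is terms joined by + or -, term is factors joined by * or /, and a factor is a number, a variable, a parenthesised expression, or a function call with one or more comma-separated arguments. All names here are hypothetical.

```python
import re

# Assumed token set: whitespace, numbers like 12 or 3.5, the four functions,
# variables V<digits>, and single-character operators/punctuation.
TOKEN_RE = re.compile(
    r"(?P<WS>\s+)|(?P<NUM>\d+(?:\.\d+)?)|(?P<FUNC>SUM|MAX|MIN|AVG)"
    r"|(?P<VAR>V\d+)|(?P<OP>[-+*/(),])"
)


def tokenize(text):
    """Split text into (kind, text) pairs; raise ValueError on an unknown character."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else (None, None)

    def take(self, expected=None):
        kind, text = self.peek()
        if kind is None or (expected is not None and text != expected):
            raise ValueError(f"expected {expected!r}, got {text!r}")
        self.pos += 1

    def expr(self):  # expr := term (('+' | '-') term)*
        self.term()
        while self.peek()[1] in ("+", "-"):
            self.take()
            self.term()

    def term(self):  # term := factor (('*' | '/') factor)*
        self.factor()
        while self.peek()[1] in ("*", "/"):
            self.take()
            self.factor()

    def factor(self):
        kind, text = self.peek()
        if text in ("+", "-"):          # unary sign
            self.take()
            self.factor()
        elif kind in ("NUM", "VAR"):
            self.take()
        elif kind == "FUNC":            # FUNC '(' expr (',' expr)* ')'
            self.take()
            self.take("(")
            self.expr()
            while self.peek()[1] == ",":
                self.take()
                self.expr()
            self.take(")")
        elif text == "(":               # '(' expr ')'
            self.take()
            self.expr()
            self.take(")")
        else:
            raise ValueError(f"unexpected token {text!r}")


def is_well_formed(formula):
    """Return True if the formula parses completely under the assumed grammar."""
    try:
        parser = Parser(tokenize(formula))
        parser.expr()
        return parser.pos == len(parser.tokens)
    except ValueError:
        return False


print(is_well_formed("AVG(SUM(1, 2, 3), SUM(4, 5, 6)) * 100"))
print(is_well_formed("SUM(V1/2,(2+7),3)+(V1+V2)"))
print(is_well_formed("1 + + )"))
```

Unlike the regex, this rejects inputs such as "1 + + )" or "SUM(1, 2" even though every character in them is allowed. A parser generator like ANTLR would produce the equivalent of this from a grammar file instead of hand-written methods.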

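Since all you care about is the valid characters, that is indeed a job for regexes.

A simple way to filter this is just to add letters to the valid characters:

^[A-Z0-9+\-*/()., ]+$

The hyphen is escaped and * is listed explicitly because an unescaped +-/ inside a character class is a range from + to /, which does not include * (so the first sample string would be rejected). You can even add a-z if you want to allow lowercase characters as well. Keep in mind that this accepts any uppercase letters, not just SUM, MAX, MIN, AVG and V.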
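
A short Python sketch of the character-class approach, for comparison. It contrasts a class that relies on the +-/ range with one that escapes the hyphen and lists every operator; the names RANGE_CLASS, EXPLICIT_CLASS and only_allowed_chars are made up for this example.

```python
import re

# Inside a class, "+-/" is a range from "+" to "/": it covers + , - . / but not "*".
RANGE_CLASS = re.compile(r"[A-Z0-9+-/()., ]+")
# Escaping the hyphen and listing "*" explicitly makes the intent unambiguous.
EXPLICIT_CLASS = re.compile(r"[A-Z0-9+\-*/()., ]+")


def only_allowed_chars(pattern, formula):
    """Return True if every character of the formula belongs to the class."""
    return pattern.fullmatch(formula) is not None


sample_1 = "AVG(SUM(1, 2, 3), SUM(4, 5, 6)) * 100"
sample_2 = "SUM(V1/2,(2+7),3)+(V1+V2)"

print(only_allowed_chars(RANGE_CLASS, sample_1))        # False: "*" is outside the range
print(only_allowed_chars(EXPLICIT_CLASS, sample_1))     # True
print(only_allowed_chars(EXPLICIT_CLASS, sample_2))     # True
print(only_allowed_chars(EXPLICIT_CLASS, "COUNT(X9)"))  # True: any uppercase letter passes
print(only_allowed_chars(EXPLICIT_CLASS, "sum(1)"))     # False: lowercase is not in the class
```

The trade-off is visible in the fourth line: a pure character class is simpler than the alternation, but it cannot tell SUM from COUNT or V1 from X9.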
