RegEx for matching mathematical expressions

I am currently tasked with verifying whether a given expression is correctly formatted. For example, "7x^2 + 2x +1" would pass, but "7x72" or anything else malformed would fail.

Attempt

\d*x{0,1}(\^\d*){0,1}

It checks for each piece of a term: a coefficient, an optional x, and an optional exponent. I'm just not sure how to format it so that every part between the +/- signs has to be correct; otherwise the function isn't keyed in correctly.

How do I solve this problem?
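
To make the gap concrete, here is a minimal sketch that runs the attempted pattern against a few inputs with Matcher.matches(). The class name AttemptCheck and the extra inputs "7x^" and "" are made up for illustration. Because every piece is optional, the pattern accepts the empty string and a dangling "^", and because it describes a single term, it rejects the full polynomial.

```java
import java.util.regex.Pattern;

class AttemptCheck {
    // The pattern from the attempt above: one term in which every piece is optional.
    static final Pattern ATTEMPT = Pattern.compile("\\d*x{0,1}(\\^\\d*){0,1}");

    // True only when the pattern matches the whole input.
    static boolean accepts(String input) {
        return ATTEMPT.matcher(input).matches();
    }

    public static void main(String[] args) {
        // "7x^" and "" are hypothetical inputs added for this sketch.
        String[] inputs = { "7x^2", "7x^2 + 2x +1", "7x72", "7x^", "" };
        for (String input : inputs) {
            System.out.printf("%-16s %b%n", '"' + input + '"', accepts(input));
        }
    }
}
```

Only "7x^2 + 2x +1" and "7x72" are rejected here; the first of those is the one that should pass.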

TL;DR Regex is:

(?!$)(?:(-?\d*)x\^2)?(?:(?<=2)\s*([+-])\s*(?!-|^))?(?:(?<!2)(-?\d*)x)?(?:(?<=[2x])\s*([+-])\s*(?!-|^))?(?:(?<![2x])(-?\d+))?

The full syntax would be:

[ ['-'][number]'x^2' ] ['+' | '-'] [ ['-'][number]'x' ] ['+' | '-'] [ ['-']number ]

['-'] means an optional minus sign for the number.

[number] is optional because a multiplier of 1 can be omitted, even if it is -1, e.g. -x is a valid shorthand for -1x.

['+' | '-'] means a + or a -, marked optional to keep it simple, but it is actually required between parts and not allowed by itself. That is the tricky part that will make the regex grow in size.

So, let's build it up:

part1: -?\d*x\^2
part2: -?\d*x
part3: -?\d+
OP   : \s*[+-]\s*(?!-)

The (?!-) is to prevent a minus sign from following an operator.
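
As a quick sanity check, the building blocks can be tried in isolation. This is only a sketch: the class name BuildingBlocks and the helper whole are invented here, and the operator is tested together with part3 because a lookahead only means something when there is input after it.

```java
import java.util.regex.Pattern;

class BuildingBlocks {
    static final String PART1 = "-?\\d*x\\^2";       // quadratic term, coefficient optional
    static final String PART2 = "-?\\d*x";           // linear term, coefficient optional
    static final String PART3 = "-?\\d+";            // constant term, digits required
    static final String OP    = "\\s*[+-]\\s*(?!-)"; // operator not followed by a minus sign

    // True when the regex matches the whole input (hypothetical helper).
    static boolean whole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(whole(PART1, "x^2"));         // coefficient omitted
        System.out.println(whole(PART2, "-x"));          // shorthand for -1x
        System.out.println(whole(PART3, "-"));           // a bare minus is not a constant
        System.out.println(whole(OP + PART3, " + 5"));   // operator followed by a constant
        System.out.println(whole(OP + PART3, " + -5"));  // minus right after the operator
    }
}
```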

What is left is [part1] [OP] [part2] [OP] [part3], with three optional parts and two optional operators, except that parts must be separated by operators, operators must be between parts, and at least one part must be present.

To ensure something is present, we just need to prevent the empty string, i.e. start the regex with (?!$), a zero-width negative lookahead for end-of-string, aka "at start of input, we're not also at end of input".

In regex we make something optional using (?:xxx)?, i.e. a non-capturing group marked optional, so making each part and operator optional is easy.

So far we therefore have:

(?!$)(?:part1)?(?:OP)?(?:part2)?(?:OP)?(?:part3)?
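
A small sketch of this intermediate pattern, with the placeholders substituted, shows what it already gets right and what it still lets through. The class name Skeleton and the malformed inputs are assumptions made for illustration.

```java
import java.util.regex.Pattern;

class Skeleton {
    static final String PART1 = "-?\\d*x\\^2";
    static final String PART2 = "-?\\d*x";
    static final String PART3 = "-?\\d+";
    static final String OP    = "\\s*[+-]\\s*(?!-)";

    // Every part and operator optional, empty input excluded by (?!$).
    static final String SKELETON =
        "(?!$)(?:" + PART1 + ")?(?:" + OP + ")?(?:" + PART2 + ")?(?:" + OP + ")?(?:" + PART3 + ")?";

    static boolean accepts(String input) {
        return Pattern.matches(SKELETON, input);
    }

    public static void main(String[] args) {
        System.out.println(accepts("3x^2 + 4x + 5")); // well formed
        System.out.println(accepts(""));              // rejected by (?!$)
        System.out.println(accepts("x^2x5"));         // parts with no operators between them
        System.out.println(accepts("+"));             // an operator with no parts
    }
}
```

The last two inputs are still accepted at this stage, which is what the extra conditions that follow are meant to prevent.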

Now for the tricky parts:

  • Parts must be separated by operators

    This is most easily tested by ensuring that part2 is not immediately preceded by a 2 (last character of part1), and that part3 is not immediately preceded by a 2 or an x (last character of part1 or part2), so we'll use zero-width negative lookbehinds.

     (?:(?<!2)part2) (?:(?<![2x])part3)
  • Operators must be between parts

    Operators cannot be first or last

     (?:(?<!^)OP(?!^))

    Operators cannot be adjacent, which is most easily tested by ensuring that the second operator is immediately preceded by part1 or part2. That makes the "not at start" check redundant.

     (?:(?<=[2x])OP)

    For consistency, the "not at start" check for the first operator can be changed to "must follow part1".

     (?:(?<=2)OP)
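
The effect of the lookbehind guards can be seen in isolation with two reduced patterns. This is a sketch only: the class GuardDemo and both reduced patterns are made up here and are not the final regex.

```java
import java.util.regex.Pattern;

class GuardDemo {
    static final String PART1 = "-?\\d*x\\^2";
    static final String PART2 = "-?\\d*x";
    static final String PART3 = "-?\\d+";
    static final String OP    = "\\s*[+-]\\s*(?!-)";

    // Parts only: a later part may not start right after the 2 or x that ends an earlier part.
    static final String GUARDED_PARTS =
        "(?:" + PART1 + ")?(?:(?<!2)" + PART2 + ")?(?:(?<![2x])" + PART3 + ")?";

    // Linear term, operator, constant: the operator must directly follow a part.
    static final String GUARDED_OPERATOR =
        "(?:" + PART2 + ")?(?:(?<=[2x])" + OP + ")?(?:(?<![2x])" + PART3 + ")?";

    static boolean whole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(whole(GUARDED_PARTS, "x^2"));       // single part
        System.out.println(whole(GUARDED_PARTS, "x^2x5"));     // adjacent parts are blocked
        System.out.println(whole(GUARDED_OPERATOR, "4x + 5")); // operator between two parts
        System.out.println(whole(GUARDED_OPERATOR, "+ 5"));    // leading operator is blocked
    }
}
```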

That now gives us (shown on multiple lines for clarity):

(?!$)
(?:part1)?
(?:(?<=2)OP(?!^))?
(?:(?<!2)part2)?
(?:(?<=[2x])OP(?!^))?
(?:(?<![2x])part3)?

All combined, with capture groups added to capture the (signed) numbers and the operators:

(?!$)(?:(-?\d*)x\^2)?(?:(?<=2)\s*([+-])\s*(?!-|^))?(?:(?<!2)(-?\d*)x)?(?:(?<=[2x])\s*([+-])\s*(?!-|^))?(?:(?<![2x])(-?\d+))?
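
To check that the one-line regex really is the multi-line template with the placeholders filled in, the following sketch assembles it from named pieces and compares the result with the literal. The class name Assembled, the constant names and the groups helper are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Assembled {
    // Building blocks with the capture groups already added.
    static final String PART1 = "(-?\\d*)x\\^2";
    static final String PART2 = "(-?\\d*)x";
    static final String PART3 = "(-?\\d+)";
    static final String OP    = "\\s*([+-])\\s*(?!-|^)";

    static final String ASSEMBLED =
        "(?!$)"
        + "(?:" + PART1 + ")?"
        + "(?:(?<=2)" + OP + ")?"
        + "(?:(?<!2)" + PART2 + ")?"
        + "(?:(?<=[2x])" + OP + ")?"
        + "(?:(?<![2x])" + PART3 + ")?";

    // The combined regex, copied as a Java string literal.
    static final String COMBINED =
        "(?!$)(?:(-?\\d*)x\\^2)?(?:(?<=2)\\s*([+-])\\s*(?!-|^))?(?:(?<!2)(-?\\d*)x)?"
        + "(?:(?<=[2x])\\s*([+-])\\s*(?!-|^))?(?:(?<![2x])(-?\\d+))?";

    // Returns the five captured groups, or null when the input does not match.
    static String[] groups(String input) {
        Matcher m = Pattern.compile(ASSEMBLED).matcher(input);
        if (!m.matches()) {
            return null;
        }
        String[] result = new String[m.groupCount()];
        for (int i = 0; i < result.length; i++) {
            result[i] = m.group(i + 1); // unmatched optional groups stay null
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(ASSEMBLED.equals(COMBINED));
        System.out.println(Arrays.toString(groups("3x^2 + 4x + 5")));
        System.out.println(Arrays.toString(groups("x^2x5")));
    }
}
```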

Here it is, coded in Java, with logic for finding a, b, and c in the formula ax^2 + bx + c:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuadraticRegexDemo {

    public static void main(String[] args) {
        System.out.println("part1 op1   part2 op2   part3   a  b  c  input");
        test("3x^2 + 4x + 5");
        test("x^2 + x + 1");
        test("x^2 - 4x");
        test("-x^2 - 1");
        test("-4x - 5");
        test("-3x^2");
        test("-4x");
        test("-5");
        test("");
        test("-3x^2 + -1x");
    }

    // Matches the input against the regex and prints the captured pieces
    // together with the coefficients a, b and c derived from them.
    private static void test(String input) {
        String regex = "(?!$)" +
                       "(?:(-?\\d*)x\\^2)?" +
                       "(?:(?<=2)\\s*([+-])\\s*(?!-|^))?" +
                       "(?:(?<!2)(-?\\d*)x)?" +
                       "(?:(?<=[2x])\\s*([+-])\\s*(?!-|^))?" +
                       "(?:(?<![2x])(-?\\d+))?";
        Matcher m = Pattern.compile(regex).matcher(input);
        if (! m.matches()) {
            System.out.printf("%-41s\"%s\"%n", "No match", input);
        } else {
            // Groups of optional parts that did not participate are null.
            String part1 = m.group(1);
            String op1   = m.group(2);
            String part2 = m.group(3);
            String op2   = m.group(4);
            String part3 = m.group(5);
            long a = parse(null, part1);
            long b = parse(op1, part2);
            // When part2 is absent, the operator before part3 is captured as op1.
            long c = parse((op2 != null ? op2 : op1), part3);
            System.out.printf("%-6s%-6s%-6s%-6s%-6s%3d%3d%3d  \"%s\"%n",
                              (part1 == null ? "" : '"' + part1 + '"'),
                              (op1   == null ? "" : '"' + op1   + '"'),
                              (part2 == null ? "" : '"' + part2 + '"'),
                              (op2   == null ? "" : '"' + op2   + '"'),
                              (part3 == null ? "" : '"' + part3 + '"'),
                              a, b, c, input);
        }
    }

    // Converts a captured coefficient to a number: null means the part is absent (0),
    // "" means an omitted 1, "-" means an omitted -1. A preceding "-" operator negates it.
    private static long parse(String operator, String signedNumber) {
        long number;
        if (signedNumber == null)
            number = 0;
        else if (signedNumber.isEmpty())
            number = 1;
        else if (signedNumber.equals("-"))
            number = -1;
        else
            number = Long.parseLong(signedNumber);
        if ("-".equals(operator))
            number = -number;
        return number;
    }
}

Output

part1 op1   part2 op2   part3   a  b  c  input
"3"   "+"   "4"   "+"   "5"     3  4  5  "3x^2 + 4x + 5"
""    "+"   ""    "+"   "1"     1  1  1  "x^2 + x + 1"
""    "-"   "4"                 1 -4  0  "x^2 - 4x"
"-"   "-"               "1"    -1  0 -1  "-x^2 - 1"
            "-4"  "-"   "5"     0 -4 -5  "-4x - 5"
"-3"                           -3  0  0  "-3x^2"
            "-4"                0 -4  0  "-4x"
                        "-5"    0  0 -5  "-5"
No match                                 ""
No match                                 "-3x^2 + -1x"
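
One case the test inputs above do not exercise is a trailing operator such as "3x^2 +". Without the MULTILINE flag, ^ matches only at the very start of the input, so the lookahead (?!-|^) placed after an operator can only ever fail on a following minus sign. The sketch below compares the regex as posted with a variant that uses $ in that lookahead. The variant is an assumption about the intent ("operators cannot be last"), not part of the answer, and the class name TrailingOperator is invented for the example.

```java
import java.util.regex.Pattern;

class TrailingOperator {
    // The regex exactly as posted above.
    static final String AS_POSTED =
        "(?!$)(?:(-?\\d*)x\\^2)?(?:(?<=2)\\s*([+-])\\s*(?!-|^))?(?:(?<!2)(-?\\d*)x)?"
        + "(?:(?<=[2x])\\s*([+-])\\s*(?!-|^))?(?:(?<![2x])(-?\\d+))?";

    // Hypothetical variant: the operator may not be followed by a minus sign or by end of input.
    static final String END_ANCHORED = AS_POSTED.replace("(?!-|^)", "(?!-|$)");

    static boolean accepts(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String[] inputs = { "3x^2 + 4x + 5", "-4x - 5", "3x^2 +", "4x -" };
        for (String input : inputs) {
            System.out.printf("%-16s posted=%-5b variant=%b%n",
                              '"' + input + '"',
                              accepts(AS_POSTED, input),
                              accepts(END_ANCHORED, input));
        }
    }
}
```

In this sketch both patterns accept the two well-formed inputs, while only the variant rejects the two inputs that end in an operator. A leading operator is already ruled out in both by the (?<=2) and (?<=[2x]) lookbehinds.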

Try this:

((-?\d+)x\^2\s*[+-]\s*)?(\d+)x\s*([+-]\s*\d+)
  • ((-?\d+)x\^2\s*[+-]\s*)?
    • -?\d+ - an optional - sign followed by one or more digits
    • x\^2 - an x character, a ^ character, and a 2 character
    • \s* - optional whitespace
    • [+-] - either a + character or a - character
    • ? - 0 or 1 of this group. A quadratic expression may not have the first part. Remove this ? to make the group required
  • (\d+)x - one or more digits followed by an x character (e.g. 2x)
  • \s* - optional whitespace
  • ([+-]\s*\d+)
    • [+-] - either a + character or a - character
    • \s* - optional whitespace
    • \d+ - one or more digits (e.g. 3)
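
For comparison, here is a minimal sketch that runs this shorter pattern against a few inputs. The class name SecondAnswer and the chosen inputs are assumptions for illustration. As written, the pattern requires explicit digits for every coefficient and always expects both the linear and the constant term.

```java
import java.util.regex.Pattern;

class SecondAnswer {
    static final Pattern QUADRATIC =
        Pattern.compile("((-?\\d+)x\\^2\\s*[+-]\\s*)?(\\d+)x\\s*([+-]\\s*\\d+)");

    // True only when the pattern matches the whole input.
    static boolean accepts(String input) {
        return QUADRATIC.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] inputs = { "3x^2 + 4x + 5", "4x - 5", "x^2 + x + 1", "-4x - 5", "3x^2" };
        for (String input : inputs) {
            System.out.printf("%-16s %b%n", '"' + input + '"', accepts(input));
        }
    }
}
```

Only the first two inputs are accepted: the others omit a coefficient, start the linear term with a minus sign, or leave out the linear and constant terms.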
