RegEx for matching mathematical expressions
I am currently tasked with verifying whether a given expression is correctly formatted. For example, "7x^2 + 2x +1" would pass, but "7x72" or something malformed would fail. My current attempt is:
\d*x{0,1}(\^\d*){0,1}
It checks for the existence of each piece of a term: coefficient, optional x, optional exponent. I'm just not sure how to format it so that every part between the +/- signs has to be correct; otherwise the function isn't keyed in correctly.
How do I solve this problem?
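To see where the attempt above falls short, here is a minimal check that runs the pattern against whole inputs with `Pattern.matches`. The class and method names are illustrative and not from the question. Because the pattern describes a single term and every piece of it is optional, it should reject a sum of terms while accepting the empty string and a dangling `^`.

```java
import java.util.regex.Pattern;

public class SingleTermCheck {
    // The pattern from the question: optional digits, optional x, optional ^digits.
    static final String TERM = "\\d*x{0,1}(\\^\\d*){0,1}";

    // True only if the pattern matches the entire input.
    static boolean accepts(String input) {
        return Pattern.matches(TERM, input);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"7x^2", "7x^2 + 2x +1", "7x72", "", "7x^"}) {
            System.out.printf("%-16s %b%n", '"' + s + '"', accepts(s));
        }
    }
}
```

A single term such as "7x^2" is accepted, but so are "" and "7x^", while the full expression "7x^2 + 2x +1" is rejected.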
TL;DR: the regex is:
(?!$)(?:(-?\d*)x\^2)?(?:(?<=2)\s*([+-])\s*(?!-|^))?(?:(?<!2)(-?\d*)x)?(?:(?<=[2x])\s*([+-])\s*(?!-|^))?(?:(?<![2x])(-?\d+))?
The full syntax would be:
[ ['-'][number]'x^2' ] ['+' | '-'] [ ['-'][number]'x' ] ['+' | '-'] [ ['-']number ]
['-'] means an optional minus sign for the number.
[number] is optional because a multiplier of 1 can be omitted, even if it is -1, e.g. -x is a valid shorthand for -1x.
['+' | '-'] means a + or a -. It is marked optional to keep the syntax simple, but it is actually required between parts and not allowed by itself. That is the tricky part that makes the regex grow in size.
So, let's build it up:
part1: -?\d*x\^2
part2: -?\d*x
part3: -?\d+
OP : \s*[+-]\s*(?!-)
The (?!-) is there to prevent a minus sign from following an operator.
What is left is [part1] [OP] [part2] [OP] [part3], with three optional parts and two optional operators, except that parts must be separated by operators, operators must sit between parts, and at least one part must be present.
To ensure something is present, we just need to prevent the empty string, i.e. start the regex with (?!$), a zero-width negative lookahead for end-of-string, aka "at start of input, we're not also at end of input".
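As a small illustration of the (?!$) guard, the sketch below compares an all-optional pattern with and without the lookahead. The stand-in pattern and the class name are assumptions made for the example, not part of the answer.

```java
import java.util.regex.Pattern;

public class NonEmptyGuard {
    // A stand-in for "everything is optional": an optional signed integer.
    static final String ALL_OPTIONAL = "(?:-?\\d+)?";
    // The same pattern, prefixed with the "not at end of input" lookahead.
    static final String GUARDED = "(?!$)" + ALL_OPTIONAL;

    static boolean accepts(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(accepts(ALL_OPTIONAL, "")); // true: nothing is required
        System.out.println(accepts(GUARDED, ""));      // false: start of input is also the end
        System.out.println(accepts(GUARDED, "-5"));    // true: non-empty input is unaffected
    }
}
```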
In regex, we make something optional using (?: xxx )?, i.e. a non-capturing group marked optional, so making each part and operator optional is easy.
So far we therefore have:
(?!$)(?:part1)?(?:OP)?(?:part2)?(?:OP)?(?:part3)?
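To make the need for the next steps concrete, the skeleton can be assembled from the building blocks and tried on a few inputs. This is only a sketch: the constant and class names are chosen for illustration. At this stage the pattern should still accept terms glued together without operators, and an operator on its own.

```java
import java.util.regex.Pattern;

public class SkeletonCheck {
    static final String PART1 = "-?\\d*x\\^2";
    static final String PART2 = "-?\\d*x";
    static final String PART3 = "-?\\d+";
    static final String OP = "\\s*[+-]\\s*(?!-)";

    // (?!$)(?:part1)?(?:OP)?(?:part2)?(?:OP)?(?:part3)?
    static final Pattern SKELETON = Pattern.compile(
            "(?!$)"
            + "(?:" + PART1 + ")?"
            + "(?:" + OP + ")?"
            + "(?:" + PART2 + ")?"
            + "(?:" + OP + ")?"
            + "(?:" + PART3 + ")?");

    static boolean accepts(String input) {
        return SKELETON.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"3x^2 + 4x + 5", "", "3x^24x5", "+"}) {
            System.out.printf("%-16s %b%n", '"' + s + '"', accepts(s));
        }
    }
}
```

Only the empty string is rejected here; "3x^24x5" and "+" still pass, which is what the lookbehinds in the following steps are meant to rule out.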
Now for the tricky parts:
Parts must be separated by operators
This is most easily tested by ensuring that part2 is not immediately preceded by a 2 (last character of part1), and that part3 is not immediately preceded by a 2 or an x (last character of part1 or part2), so we'll use zero-width negative lookbehinds.
(?:(?<!2)part2) (?:(?<![2x])part3)
Operators must be between parts
Operators cannot be first or last
(?:(?<!^)OP(?!^))
Operators cannot be adjacent. This is most easily tested by ensuring that the second operator is immediately preceded by part1 or part2, which makes the "not at start" check redundant.
(?:(?<=[2x])OP)
For consistency, the "not at start" check for the first operator can be changed to "must be preceded by part1".
(?:(?<=2)OP)
That now gives us (shown on multiple lines for clarity):
(?!$)
(?:part1)?
(?:(?<=2)OP(?!^))?
(?:(?<!2)part2)?
(?:(?<=[2x])OP(?!^))?
(?:(?<![2x])part3)?
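The multi-line form above can be assembled the same way and checked against inputs that the plain skeleton would let through. Again, this is a sketch with illustrative names, substituting the part1/part2/part3/OP definitions given earlier.

```java
import java.util.regex.Pattern;

public class AssembledCheck {
    static final String PART1 = "-?\\d*x\\^2";
    static final String PART2 = "-?\\d*x";
    static final String PART3 = "-?\\d+";
    static final String OP = "\\s*[+-]\\s*(?!-)";

    static final Pattern ASSEMBLED = Pattern.compile(
            "(?!$)"
            + "(?:" + PART1 + ")?"
            + "(?:(?<=2)" + OP + "(?!^))?"      // first operator must follow part1
            + "(?:(?<!2)" + PART2 + ")?"        // part2 must not directly follow part1
            + "(?:(?<=[2x])" + OP + "(?!^))?"   // second operator must follow part1 or part2
            + "(?:(?<![2x])" + PART3 + ")?");   // part3 must not directly follow part1 or part2

    static boolean accepts(String input) {
        return ASSEMBLED.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"3x^2 + 4x + 5", "x^2 - 4x", "-5", "3x^24x5", "+", ""}) {
            System.out.printf("%-16s %b%n", '"' + s + '"', accepts(s));
        }
    }
}
```

With the lookbehinds in place, "3x^24x5" and a lone "+" no longer match, while the well-formed inputs still do.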
All combined, with capture groups added to capture the (signed) numbers and the operators:
(?!$)(?:(-?\d*)x\^2)?(?:(?<=2)\s*([+-])\s*(?!-|^))?(?:(?<!2)(-?\d*)x)?(?:(?<=[2x])\s*([+-])\s*(?!-|^))?(?:(?<![2x])(-?\d+))?
Here it is, coded in Java, with logic for finding a, b, and c in the formula ax^2 + bx + c:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The class name is arbitrary; it only makes the snippet compile on its own.
public class QuadraticRegexDemo {
    public static void main(String[] args) {
        System.out.println("part1 op1   part2 op2   part3   a  b  c input");
        test("3x^2 + 4x + 5");
        test("x^2 + x + 1");
        test("x^2 - 4x");
        test("-x^2 - 1");
        test("-4x - 5");
        test("-3x^2");
        test("-4x");
        test("-5");
        test("");
        test("-3x^2 + -1x");
    }

    private static void test(String input) {
        String regex = "(?!$)" +                               // reject the empty string
                       "(?:(-?\\d*)x\\^2)?" +                  // part1, coefficient in group 1
                       "(?:(?<=2)\\s*([+-])\\s*(?!-|^))?" +    // operator after part1, group 2
                       "(?:(?<!2)(-?\\d*)x)?" +                // part2, coefficient in group 3
                       "(?:(?<=[2x])\\s*([+-])\\s*(?!-|^))?" + // operator after part1/part2, group 4
                       "(?:(?<![2x])(-?\\d+))?";               // part3, constant in group 5
        Matcher m = Pattern.compile(regex).matcher(input);
        if (! m.matches()) {
            System.out.printf("%-41s\"%s\"%n", "No match", input);
        } else {
            String part1 = m.group(1);
            String op1   = m.group(2);
            String part2 = m.group(3);
            String op2   = m.group(4);
            String part3 = m.group(5);
            long a = parse(null, part1);
            long b = parse(op1, part2);
            // If part2 is absent, the operator before part3 was captured as op1.
            long c = parse((op2 != null ? op2 : op1), part3);
            System.out.printf("%-6s%-6s%-6s%-6s%-6s%3d%3d%3d \"%s\"%n",
                              (part1 == null ? "" : '"' + part1 + '"'),
                              (op1   == null ? "" : '"' + op1   + '"'),
                              (part2 == null ? "" : '"' + part2 + '"'),
                              (op2   == null ? "" : '"' + op2   + '"'),
                              (part3 == null ? "" : '"' + part3 + '"'),
                              a, b, c, input);
        }
    }

    // Converts a captured coefficient to a number and applies the preceding operator.
    private static long parse(String operator, String signedNumber) {
        long number;
        if (signedNumber == null)
            number = 0;           // the part is absent
        else if (signedNumber.isEmpty())
            number = 1;           // "x" is shorthand for "1x"
        else if (signedNumber.equals("-"))
            number = -1;          // "-x" is shorthand for "-1x"
        else
            number = Long.parseLong(signedNumber);
        if ("-".equals(operator))
            number = -number;
        return number;
    }
}
Output
part1 op1   part2 op2   part3   a  b  c input
"3"   "+"   "4"   "+"   "5"     3  4  5 "3x^2 + 4x + 5"
""    "+"   ""    "+"   "1"     1  1  1 "x^2 + x + 1"
""    "-"   "4"                 1 -4  0 "x^2 - 4x"
"-"   "-"               "1"    -1  0 -1 "-x^2 - 1"
            "-4"  "-"   "5"     0 -4 -5 "-4x - 5"
"-3"                           -3  0  0 "-3x^2"
            "-4"                0 -4  0 "-4x"
                        "-5"    0  0 -5 "-5"
No match                                 ""
No match                                 "-3x^2 + -1x"
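One edge case may be worth checking separately. Inside the operator groups, (?!^) asserts "not at start of input", which always holds once an operator has been consumed, so as far as I can tell the pattern still accepts a trailing operator such as "3x^2 +". The sketch below compares the regex as given with a hypothetical variant (not part of the answer above) that uses (?!-|$) instead, which would line up with the stated rule that an operator cannot be last.

```java
import java.util.regex.Pattern;

public class TrailingOperatorCheck {
    // The regex exactly as given in the answer.
    static final String ORIGINAL =
            "(?!$)(?:(-?\\d*)x\\^2)?(?:(?<=2)\\s*([+-])\\s*(?!-|^))?(?:(?<!2)(-?\\d*)x)?"
            + "(?:(?<=[2x])\\s*([+-])\\s*(?!-|^))?(?:(?<![2x])(-?\\d+))?";

    // Hypothetical variant: the operator must not be followed by '-' or by end of input.
    static final String VARIANT = ORIGINAL.replace("(?!-|^)", "(?!-|$)");

    static boolean accepts(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"3x^2 + 4x + 5", "-4x - 5", "3x^2 +", "4x -"}) {
            System.out.printf("%-16s original=%b variant=%b%n",
                    '"' + s + '"', accepts(ORIGINAL, s), accepts(VARIANT, s));
        }
    }
}
```

Both patterns accept the well-formed inputs; only the variant rejects "3x^2 +" and "4x -".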
Try this:
((-?\d+)x\^2\s*[+-]\s*)?(\d+)x\s*([+-]\s*\d+)
Piece by piece (backslashes are doubled here, as they would be in a Java string literal):
((-?\\d+)x\\^2\\s*[+-]\\s*)?
  -?\\d+ - 0 or 1 - sign followed by one or more digits
  x\\^2 - x character, ^ character, 2 character
  \\s* - optional space
  [+-] - either + character or - character
  ? - 0 or 1 of this group. A quadratic formula may not have the first part. Remove this ? to make it not optional.
(\\d+)x - one or more digits followed by the x character (ex: 2x)
\\s* - optional space
([+-]\\s*\\d+)
  \\s* - optional space
  \\d+ - one or more digits (ex: 3)
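A quick way to see what this second pattern does and does not accept is to run it against a few inputs. The sketch below uses illustrative class and method names. Since the pattern requires explicit digits on every coefficient and always expects both a linear and a constant term, shorthand such as x^2 + x + 1 should not pass.

```java
import java.util.regex.Pattern;

public class SecondPatternCheck {
    // The pattern from this answer, written as a Java string literal.
    static final Pattern QUADRATIC = Pattern.compile(
            "((-?\\d+)x\\^2\\s*[+-]\\s*)?(\\d+)x\\s*([+-]\\s*\\d+)");

    static boolean accepts(String input) {
        return QUADRATIC.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"3x^2 + 4x + 5", "4x + 5", "x^2 + x + 1", "3x^2", "-4x - 5"}) {
            System.out.printf("%-16s %b%n", '"' + s + '"', accepts(s));
        }
    }
}
```

The first two inputs match. The last three do not: implicit coefficients, a missing linear or constant term, and a leading minus on the linear term are all outside what the pattern describes.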