Regex Evaluating Matches Incorrectly
I am having trouble getting a Java-flavored regular expression to evaluate a match correctly. I define the following regular expressions:
//Any digit
static String NUM = "[0-9]";
//Exponent with only 3 digits specified
static String EXPONENT = "([Ee][+-]?" + NUM + "(" + NUM + "(" + NUM + ")?)?)";
static String NUMBER = "([+-]?((" + NUM + NUM + "*.?" + NUM + "*)|(." + NUM
+ NUM + "*))" + EXPONENT + "?)";
static String S_COMMA_S = "(( )*,( )*)";
static String NUM_DATA = "(" + NUMBER + "(" + S_COMMA_S + NUMBER + ")*)";
Given how NUM_DATA is defined, a possible match would be "123, 456". As far as I understand, any list of numbers ending with a number and not a comma should be valid. However, according to the following test method, it also matches a number list ending in a comma:
public static void main(String[] args) {
System.out.println(NUM_DATA);
String s = "123";
System.out.println(s.matches(NUM_DATA));
s = "123, 456";
System.out.println(s.matches(NUM_DATA));
s = "123, 456,";//HANGING COMMA, SHOULD NOT MATCH
System.out.println(s.matches(NUM_DATA));
}
Which results in the following output:
(([+-]?(([0-9][0-9]*.?[0-9]*)|(.[0-9][0-9]*))([Ee][+-]?[0-9]([0-9]([0-9])?)?)?)((( )*,( )*)([+-]?(([0-9][0-9]*.?[0-9]*)|(.[0-9][0-9]*))([Ee][+-]?[0-9]([0-9]([0-9])?)?)?))*)
true
true
true
Where are my assumptions going wrong? Or is this behavior incorrect?
EDIT: I suppose I should post the behavior I am expecting:
Matches: (Any list of comma separated numbers, including one number)
1.222
1.222, 324.4
2.51e123, 3e2
-.123e-12, 32.1231, 1e1, .111, -1e-1
Non-Matches:
123.321,
,
, 123.321
In your NUMBER regex you have an unescaped . which matches any character, including the comma at the end of "123, 456,". You need to escape it as \. so that it matches only a literal dot. Because the backslash itself has to be escaped inside a Java string literal, it is written "\\." in a String.
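As a minimal sketch of that fix, here are the definitions from the question with only the two dots in NUMBER escaped. The class name EscapedDotDemo and the helper isNumberList are made up for illustration; everything else is copied from the question.

```java
final class EscapedDotDemo {
    // Any digit
    static final String NUM = "[0-9]";
    // Exponent with at most 3 digits
    static final String EXPONENT = "([Ee][+-]?" + NUM + "(" + NUM + "(" + NUM + ")?)?)";
    // Same structure as in the question, but both dots are now "\\." (a literal dot)
    static final String NUMBER = "([+-]?((" + NUM + NUM + "*\\.?" + NUM + "*)|(\\." + NUM
            + NUM + "*))" + EXPONENT + "?)";
    static final String S_COMMA_S = "(( )*,( )*)";
    static final String NUM_DATA = "(" + NUMBER + "(" + S_COMMA_S + NUMBER + ")*)";

    // Hypothetical helper: true only if the whole string is a comma separated number list
    static boolean isNumberList(String s) {
        return s.matches(NUM_DATA);
    }

    public static void main(String[] args) {
        System.out.println(isNumberList("123"));       // true
        System.out.println(isNumberList("123, 456"));  // true
        System.out.println(isNumberList("123, 456,")); // false: the dot no longer eats the comma
    }
}
```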
Your regex can be refactored to a shorter one:
^([+-]?(?:\.\d+|\d+(?:\.\d+)?)(?:[Ee][+-]?\d+)?)(?: *, *([+-]?(?:\.\d+|\d+(?:\.\d+)?)(?:[Ee][+-]?\d+)?))*$
This still meets the requirements listed in your question.
You will get the first number in group 1 and the last of the following numbers in group 2.
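If you need every number rather than just the captured ones, keep in mind that java.util.regex retains only the last iteration of a repeated capturing group. One possible approach, sketched here under that assumption, is to validate the whole string first and then scan it with Matcher#find. The class NumberListExtractor and its extract method are hypothetical names, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NumberListExtractor {
    // One number, same shape as in the refactored regex above
    static final String NUMBER = "[+-]?(?:\\.\\d+|\\d+(?:\\.\\d+)?)(?:[Ee][+-]?\\d+)?";
    // A whole comma separated list, used for validation only
    static final Pattern LIST = Pattern.compile(NUMBER + "(?: *, *" + NUMBER + ")*");
    // A single number, used to scan a string that has already been validated
    static final Pattern SINGLE = Pattern.compile(NUMBER);

    // Returns every number in the list, or an empty list if the input is not a valid list
    static List<String> extract(String s) {
        List<String> numbers = new ArrayList<>();
        if (!LIST.matcher(s).matches()) {
            return numbers;
        }
        Matcher m = SINGLE.matcher(s);
        while (m.find()) {
            numbers.add(m.group());
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(extract("2.51e123, 3e2")); // [2.51e123, 3e2]
        System.out.println(extract("123.321,"));      // []
    }
}
```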
I recommend using this regex with the Pattern and Matcher API to avoid compiling this long regex again and again in String#matches.
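A rough sketch of what that could look like: the refactored regex is compiled once into a static Pattern and reused for every check. The class name NumberListValidator and the method isValid are invented for this example.

```java
import java.util.regex.Pattern;

final class NumberListValidator {
    // Compiled once; String#matches would recompile the regex on every call
    private static final Pattern NUM_DATA = Pattern.compile(
            "^([+-]?(?:\\.\\d+|\\d+(?:\\.\\d+)?)(?:[Ee][+-]?\\d+)?)"
            + "(?: *, *([+-]?(?:\\.\\d+|\\d+(?:\\.\\d+)?)(?:[Ee][+-]?\\d+)?))*$");

    // Hypothetical helper: true only if the whole input is a valid number list
    static boolean isValid(String s) {
        return NUM_DATA.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("1.222, 324.4")); // true
        System.out.println(isValid("123.321,"));     // false
        System.out.println(isValid(", 123.321"));    // false
    }
}
```

Pattern instances are immutable and safe to share between threads, so a static final field is a common place to keep one.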