How to match entire string to be one of two formats with a single regular expression?
I need to validate values that can have one of two formats. I am trying to do so with a single regular expression, but I can't figure out why it doesn't work.
The first format is exactly 17 alphanumeric characters. The expression
^[A-Za-z0-9]{17}$
correctly matches the test value 5UXWX7C56BA123456 but not the shortened value 5UXWX7C56BA12345 or the lengthened value 5UXWX7C56BA1234569.
The second format is exactly 8 alphanumeric characters followed by an asterisk or underscore and two more alphanumeric characters. The expression
^[A-Za-z0-9]{8}[*_][A-Za-z0-9]{2}$
correctly matches the test value 5UXWX7C5*BA but not the shortened value 5UXWX7C5*B or the lengthened value 5UXWX7C5*BA1.
However, when I try to combine the expressions, I get unexpected results that differ depending on which of the sub-expressions I place first. The following snippet of code demonstrates this:
var pattern1 = new Regex(@"^([A-Za-z0-9]{17})|([A-Za-z0-9]{8}[*_][A-Za-z0-9]{2})$");
var pattern2 = new Regex(@"^([A-Za-z0-9]{8}[*_][A-Za-z0-9]{2})|([A-Za-z0-9]{17})$");
var values = new string[]
{
"5UXWX7C56BA12345", "5UXWX7C56BA123456", "5UXWX7C56BA1234569",
"5UXWX7C5*B", "5UXWX7C5*BA", "5UXWX7C5*BA1"
};
Console.WriteLine($"Using {pattern1}\n");
Console.WriteLine($" {"Value",-20}{"IsMatch",-9}{"Expected",-10}");
Console.WriteLine($" {new string('-', 37)}");
values
.Select(x => new { Value = x, Result = pattern1.IsMatch(x), ExpectedResult = x.Length == 11 || x.Length == 17 })
.Select(x => $" {x.Value,-20}{x.Result,-9}{x.ExpectedResult} {(x.Result == x.ExpectedResult ? "" : "UNEXPECTED")}")
.WithEach(Console.WriteLine);
Console.WriteLine($"\n\nUsing {pattern2}\n");
Console.WriteLine($" {"Value",-20}{"IsMatch",-9}{"Expected",-10}");
Console.WriteLine($" {new string('-', 37)}");
values
.Select(x => new { Value = x, Result = pattern2.IsMatch(x), ExpectedResult = x.Length == 11 || x.Length == 17 })
.Select(x => $" {x.Value,-20}{x.Result,-9}{x.ExpectedResult} {(x.Result == x.ExpectedResult ? "" : "UNEXPECTED")}")
.WithEach(Console.WriteLine);
This produces the following results:
Using ^([A-Za-z0-9]{17})|([A-Za-z0-9]{8}[*_][A-Za-z0-9]{2})$
Value IsMatch Expected
-------------------------------------
5UXWX7C56BA12345 False False
5UXWX7C56BA123456 True True
5UXWX7C56BA1234569 True False UNEXPECTED
5UXWX7C5*B False False
5UXWX7C5*BA True True
5UXWX7C5*BA1 False False
Using ^([A-Za-z0-9]{8}[*_][A-Za-z0-9]{2})|([A-Za-z0-9]{17})$
Value IsMatch Expected
-------------------------------------
5UXWX7C56BA12345 False False
5UXWX7C56BA123456 True True
5UXWX7C56BA1234569 True False UNEXPECTED
5UXWX7C5*B False False
5UXWX7C5*BA True True
5UXWX7C5*BA1 True False UNEXPECTED
I hope someone will be able to point out the error in my expressions. Although I am using ^ and $ to try to force the entire line/value to be matched, a match is somehow still found for a longer value, even though there is a further unmatched character that I would have expected to cause the entire value not to match.
Although I used LINQPad to run the snippet above, I see the same results on regex101.com.
Your regexps are not anchored correctly:
^([A-Za-z0-9]{17})|([A-Za-z0-9]{8}[*_][A-Za-z0-9]{2})$
^                ^ ^                                 ^
Here, ([A-Za-z0-9]{17}) is only anchored at the start of the string (so anything can follow that pattern), and ([A-Za-z0-9]{8}[*_][A-Za-z0-9]{2}) is only anchored at the end of the string (so anything can precede that pattern). The alternation operator | has the lowest precedence, so ^ belongs only to the first alternative and $ only to the second.
The second pattern has the same problem; you just swapped the alternatives.
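To see the effect directly, it can help to print what each pattern actually matched instead of only checking IsMatch. The following is a minimal sketch that reuses the two patterns from the question; the class name AnchorDemo and the helper Matched are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // Patterns copied from the question. Because "|" binds most loosely,
    // "^" applies only to the first alternative and "$" only to the second.
    public const string Pattern1 = @"^([A-Za-z0-9]{17})|([A-Za-z0-9]{8}[*_][A-Za-z0-9]{2})$";
    public const string Pattern2 = @"^([A-Za-z0-9]{8}[*_][A-Za-z0-9]{2})|([A-Za-z0-9]{17})$";

    // Hypothetical helper: returns the matched substring, or a marker when nothing matches.
    public static string Matched(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : "(no match)";
    }

    public static void Main()
    {
        // First alternative matches the first 17 characters and ignores the trailing "9".
        Console.WriteLine(Matched(Pattern1, "5UXWX7C56BA1234569")); // 5UXWX7C56BA123456

        // Second alternative matches the last 17 characters and ignores the leading "5".
        Console.WriteLine(Matched(Pattern2, "5UXWX7C56BA1234569")); // UXWX7C56BA1234569

        // First alternative matches at the start and ignores the trailing "1".
        Console.WriteLine(Matched(Pattern2, "5UXWX7C5*BA1"));       // 5UXWX7C5*BA

        // Neither half-anchored alternative fits here, so there is no match.
        Console.WriteLine(Matched(Pattern1, "5UXWX7C5*BA1"));       // (no match)
    }
}
```

In each unexpected case the match covers only part of the input, which is why IsMatch returns True for the lengthened values.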
Use
var pattern1 = new Regex(@"^(?:[A-Za-z0-9]{17}|[A-Za-z0-9]{8}[*_][A-Za-z0-9]{2})$");
^ ^ ^
Otherwise, your alternatives are not anchored on both sides.
See the regex demo.
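As a quick check, the grouped pattern can be run against the six test values from the question. This is only a sketch: the class name FormatValidator and the method IsValid are illustrative names, not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FormatValidator
{
    // The non-capturing group (?:...) wraps both alternatives,
    // so "^" and "$" now apply to whichever alternative is tried.
    private static readonly Regex Combined =
        new Regex(@"^(?:[A-Za-z0-9]{17}|[A-Za-z0-9]{8}[*_][A-Za-z0-9]{2})$");

    // Hypothetical helper: true only when the whole value has one of the two formats.
    public static bool IsValid(string value) => Combined.IsMatch(value);

    public static void Main()
    {
        // Test values taken from the question.
        var values = new[]
        {
            "5UXWX7C56BA12345", "5UXWX7C56BA123456", "5UXWX7C56BA1234569",
            "5UXWX7C5*B", "5UXWX7C5*BA", "5UXWX7C5*BA1"
        };

        foreach (var value in values)
        {
            Console.WriteLine($"{value,-20}{IsValid(value)}");
        }
    }
}
```

With the group in place, only 5UXWX7C56BA123456 and 5UXWX7C5*BA should print True, and the order of the alternatives no longer matters.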