Why does this regular expression fail in .NET?
Take a look at this test method:
[Test]
public static void TestRegex() {
    var goodTextsToTest = new List<string>
    {
        "http://google.com",
        "https://google.com/",
        "ftp://bugger!!!one1",
        "ftss://shoot",
        "somelongergibberish://flkjd",
        "thescheme://green"
    };
    var badTextsToTest = new List<string> { "bad432:4h//orange", "1ftp://1bugger!!!one1", "IAmTheVeryModelOfAModernMajorGeneral", "" };
    var regex = new Regex( "^([a-z][a-z0-9+\\.\\-]*)*://", RegexOptions.IgnoreCase );
    foreach( var txt in badTextsToTest )
        Assert.IsFalse( regex.IsMatch( txt ), "Passed but should have failed: " + txt );
    foreach( var txt in goodTextsToTest )
        Assert.IsTrue( regex.IsMatch( txt ), "Failed but should have passed: " + txt );
}
As it is currently written, with
var regex = new Regex( "^([a-z][a-z0-9+\\.\\-]*)*://", RegexOptions.IgnoreCase );
this code never returns. The input the code gets stuck on is "IAmTheVeryModelOfAModernMajorGeneral".
Why does this regular expression cause an infinite loop when the input is "IAmTheVeryModelOfAModernMajorGeneral"?
Bonus question: This code does finish executing if you remove "://" from the regular expression, i.e.
var regex = new Regex( "^([a-z][a-z0-9+\\.\\-]*)*", RegexOptions.IgnoreCase );
Why does this fix it?
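Your regex, as it is designed now, will take more than 87500 steps to complete because of backtracking. See the debugger here.
This is what we call catastrophic backtracking. The nested quantifiers in ([a-z][a-z0-9+\.\-]*)* let the engine split a run of letters between the inner and the outer loop in exponentially many ways. When no "://" follows, as in "IAmTheVeryModelOfAModernMajorGeneral", the engine tries every one of those splits before giving up, so it is not a true infinite loop, just an impractically long search. Without the trailing "://", the first attempt already succeeds, so no backtracking is needed and the code finishes.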
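One way to avoid the blow-up is sketched below. It is an assumption about what the pattern is meant to do, not a fix given in the answer: drop the outer "(...)*" so that each character can be consumed in only one way. The class and method names (SchemeRegexDemo, HasScheme, TimesOut) are hypothetical. Note that this rewrite requires at least one leading letter, whereas the original pattern would also accept a bare "://" at the start of the input. The second regex keeps the original pattern but passes a match timeout through the Regex(String, RegexOptions, TimeSpan) constructor, so a runaway match throws RegexMatchTimeoutException instead of hanging.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SchemeRegexDemo
{
    // No nested quantifier: one leading letter, then a single run of
    // scheme characters, then "://". A failed match gives up quickly.
    public static readonly Regex Linear = new Regex(
        @"^[a-z][a-z0-9+.\-]*://", RegexOptions.IgnoreCase);

    // The original pattern, guarded by a 500 ms match timeout
    // (the timeout value is an arbitrary choice for illustration).
    public static readonly Regex OriginalWithTimeout = new Regex(
        @"^([a-z][a-z0-9+\.\-]*)*://",
        RegexOptions.IgnoreCase,
        TimeSpan.FromMilliseconds(500));

    // True if the text starts with something shaped like "scheme://".
    public static bool HasScheme(string text)
    {
        return Linear.IsMatch(text);
    }

    // True if matching the original pattern exceeded the timeout.
    public static bool TimesOut(string text)
    {
        try
        {
            OriginalWithTimeout.IsMatch(text);
            return false;
        }
        catch (RegexMatchTimeoutException)
        {
            return true;
        }
    }
}
```

With this sketch, SchemeRegexDemo.HasScheme can stand in for regex.IsMatch in the test method above. TimesOut is only a diagnostic aid for confirming that a given input triggers heavy backtracking.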