
Why is this Email regex so slow on Mvc?

I am currently building a system using ASP.NET, C#, and MVC 2, which uses the following regex:

^([0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9})$

This is an e-mail regex that checks whether a string has a 'valid' e-mail address format. My code is as follows:

if (!Regex.IsMatch(model.Email, @"^([0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9})$"))
    ModelState.AddModelError("Email", "The field Email is invalid.");

The regex works fine for validating e-mails. However, if a particularly long, invalid string is passed to it, the system keeps 'working' and the page never finishes loading. For instance, this is the data that I tried to pass:

iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii

The above string essentially locks up the system. I would like to know why, and whether I can use a regex that accomplishes the same thing in a simpler manner. My goal is that an incorrectly formed e-mail address such as the following is rejected:

host.@.host..com

You have nested repetition operators that can match the same characters, which is liable to cause catastrophic backtracking.

For example: ([-.\w]*[0-9a-zA-Z])*

This says: match zero or more characters from [-.\w] (hyphen, dot, or a word character such as a letter, digit, or underscore), followed by a single character from [0-9a-zA-Z], and repeat that whole group zero or more times.

The character i falls in both of these classes.

Thus, when run on iiiiiiii... the regex tries every possible way of splitting the string into repeated groups of (several "i"s followed by one "i") before it can conclude that there is no match. The number of such splits grows exponentially with the length of the string.
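
One possible way to remove the ambiguity, offered as a sketch rather than a tested replacement: every character that [0-9a-zA-Z] matches is also matched by [-.\w], so the group accepts the same strings whether it repeats many times or runs at most once. That means the outer * on the group can become ? without changing which addresses are accepted. The class and method names below are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmailFormat
{
    // Same pattern as in the question, except that the outer '*' on the
    // local-part group is now '?'. A run of characters can then be split
    // between the inner '*' and the trailing class in only a linear number
    // of ways, instead of exponentially many.
    private const string Pattern =
        @"^([0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])?@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9})$";

    // Returns true when the input has the e-mail shape described by Pattern.
    public static bool IsValid(string email)
    {
        return email != null && Regex.IsMatch(email, Pattern);
    }
}
```

In the controller, the check would then read if (!EmailFormat.IsValid(model.Email)). With this pattern a long run of i characters should be rejected promptly, and host.@.host..com is still rejected because the local part must end in a letter or digit before the @.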

In general, validating email addresses with a regular expression is hard.
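
Since a hand-written pattern is easy to get subtly wrong, it can also help to bound how long a match is allowed to run. The sketch below assumes .NET Framework 4.5 or later, where Regex.IsMatch has an overload that takes a TimeSpan match timeout and throws RegexMatchTimeoutException when it is exceeded. The helper name and the choice to treat a timeout as "invalid" are illustrative assumptions, not part of the original post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SafeRegex
{
    // Hypothetical helper: runs the match with a time limit and treats a
    // timed-out match as "no match" instead of letting the request hang.
    public static bool IsMatchWithin(string input, string pattern, TimeSpan limit)
    {
        try
        {
            return Regex.IsMatch(input, pattern, RegexOptions.None, limit);
        }
        catch (RegexMatchTimeoutException)
        {
            // The engine gave up after 'limit'; report the input as invalid.
            return false;
        }
    }
}
```

A call such as SafeRegex.IsMatchWithin(model.Email, pattern, TimeSpan.FromMilliseconds(250)) keeps a pathological input from tying up the request, although it does not fix the underlying pattern.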
