Can anyone tell me why this C# email validation regular expression (regex) hangs?

I got a good email validation regex from: Email regular expression

    public static void Main(string[] args)
    {
        string value = @"cvcvcvcvvcvvcvcvcvcvcvvcvcvcvcvcvvccvcvcvc";
        var regex = new Regex(
            @"^([0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9})$",
            RegexOptions.Compiled);
        var x = regex.Match(value); // Hangs here !?!
        return;
    }

It works in most cases, but the code above hangs, burning 100% CPU. I've tested it in a Windows 8 Metro app and in a standard .NET 4.5 app.

Can anyone tell me why this happens, and if there is a good email validation REGEX that doesn't hang, or if there is a way to fix this one?

Many thanks, Jon

The reason it hangs: catastrophic backtracking.

Let's simplify the crucial part of the regex:

(\w*[0-9a-zA-Z])*@

You have

  • an optional part \w* that can match the same characters as the following part [0-9a-zA-Z], so the two combined translate, in essence, to \w+
  • nested quantifiers: (\w+)*

This means that, given s = "cvcvcvcvvcvvcvcvcvcvcvvcvcvcvcvcvvccvcvcvc", this part of the regex has to try every possible way of splitting s into consecutive chunks (there are 2**(len(s)-1) of them) before it can report a non-match because the following @ is not found.
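To make the 2**(len(s)-1) figure concrete, here is a minimal sketch that counts the ways a run of n word characters can be cut into consecutive non-empty chunks, which is roughly what (\w+)* has to try before giving up. CountSplits is a name made up for this illustration, and the sketch only counts splits; it does not trace what the .NET regex engine actually does step by step.

```csharp
using System;

public static class BacktrackingDemo
{
    // Hypothetical helper: number of ways to cut n characters into
    // consecutive non-empty chunks (each chunk = one iteration of \w+).
    public static long CountSplits(int n)
    {
        if (n == 0) return 1; // nothing left to cut: exactly one way
        long total = 0;
        // Pick the length of the first chunk, then split the remainder.
        for (int first = 1; first <= n; first++)
        {
            total += CountSplits(n - first);
        }
        return total;
    }

    public static void Main()
    {
        // The count doubles with every extra character.
        for (int n = 1; n <= 16; n++)
        {
            Console.WriteLine("{0,2} chars -> {1} splits", n, CountSplits(n));
        }
    }
}
```

Each extra character doubles the count, which is why a string of a few dozen characters is already enough to keep the CPU busy for a very long time.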

Since you cannot validate an e-mail address with any regex (there are far too many corner cases in the spec), it's usually best to

  • do a minimal regex check ( ^.*@.*$ )
  • use a parser to check validity (like @Fake.It.Til.U.Make.It suggested)
  • try and send e-mail to it - even a seemingly valid address may be bogus, so you'd have to do this anyway.

Just for completeness, you can avoid the backtracking issues with the help of atomic groups:

var regex = new Regex(
    @"^([0-9a-zA-Z](?>[-.\w]*[0-9a-zA-Z])*@(?>[0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9})$",
    RegexOptions.Compiled);
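A quick way to check the atomic-group version is to run it against the string from the question and against an ordinary address. This is only a sketch: the class name and the sample address john.doe@example.com are assumptions made for the example, and the pattern is the one shown above, unchanged.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AtomicGroupDemo
{
    // Same pattern as above; the two repeated groups are atomic, written (?>...),
    // so once one of them has matched the engine will not re-split its text.
    public static readonly Regex AtomicEmail = new Regex(
        @"^([0-9a-zA-Z](?>[-.\w]*[0-9a-zA-Z])*@(?>[0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9})$",
        RegexOptions.Compiled);

    public static void Main()
    {
        // The input from the question that made the original regex hang.
        string hanging = "cvcvcvcvvcvvcvcvcvcvcvvcvcvcvcvcvvccvcvcvc";
        // Sample address, assumed for this example.
        string sample = "john.doe@example.com";

        Console.WriteLine(AtomicEmail.IsMatch(hanging)); // False, returned right away
        Console.WriteLine(AtomicEmail.IsMatch(sample));  // True
    }
}
```

The atomic groups change how the engine backtracks, so it is worth re-running whatever valid and invalid addresses you already test with before swapping the pattern in.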

Never ever use a regex to validate an email address.

You can use the MailAddress class (in System.Net.Mail) to validate it:

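// Requires: using System.Net.Mail;
// address is a string holding the input to check.
try
{
    // MailAddress parses the string and throws FormatException if it is malformed.
    address = new MailAddress(address).Address;
    // address is valid
}
catch (FormatException)
{
    // address is invalid
}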
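If you want this as a reusable check, one way to wrap it is sketched below. The names EmailCheck and IsValidEmail are made up for the example. The extra comparison with the input is my own addition: MailAddress also accepts forms with a display name, such as "John Doe <john.doe@example.com>", and you may not want to treat those as a plain address.

```csharp
using System;
using System.Net.Mail;

public static class EmailCheck
{
    // Hypothetical helper: true only if MailAddress can parse the input
    // and the parsed address is exactly the string that was passed in.
    public static bool IsValidEmail(string candidate)
    {
        // Handle null/empty/blank input here rather than relying on exceptions.
        if (string.IsNullOrWhiteSpace(candidate))
        {
            return false;
        }

        try
        {
            var parsed = new MailAddress(candidate);
            // Rejects input that carries a display name or other extras.
            return parsed.Address == candidate;
        }
        catch (FormatException)
        {
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(IsValidEmail("john.doe@example.com"));
        Console.WriteLine(IsValidEmail("cvcvcvcvvcvvcvcvcvcvcvvcvcvcvcvcvvccvcvcvc"));
    }
}
```

Keep in mind this only tells you the string parses as an address, not that a mailbox exists behind it.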

I guess it's because of [-.\w] in the regex; try this one instead:

^[a-zA-Z0-9_-]+(?:\.[a-zA-Z0-9_-]+)*@(?:(\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$

Also, .NET 4.5 should have EmailAddressAttribute (in System.ComponentModel.DataAnnotations) available, though I'm not sure.
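The attribute meant here is, as far as I know, EmailAddressAttribute in System.ComponentModel.DataAnnotations. It is normally put on a model property, but it can also be called directly, as in this rough sketch. The wrapper name AttributeCheck is made up, how strict the check is depends on the framework version, and you need a reference to the System.ComponentModel.DataAnnotations assembly.

```csharp
using System;
using System.ComponentModel.DataAnnotations;

public static class AttributeCheck
{
    // Hypothetical wrapper around the attribute's own IsValid(object) method.
    public static bool IsValidEmail(string candidate)
    {
        // Assumption for this sketch: a missing value counts as invalid.
        if (candidate == null)
        {
            return false;
        }
        return new EmailAddressAttribute().IsValid(candidate);
    }

    public static void Main()
    {
        Console.WriteLine(IsValidEmail("john.doe@example.com"));
        Console.WriteLine(IsValidEmail("cvcvcvcvvcvvcvcvcvcvcvvcvcvcvcvcvvccvcvcvc"));
    }
}
```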
