E-mail address validation using Regular Expressions

I'm writing a simple, small app that allows me to share information, and I have a question about using regex to validate email addresses. I'm more or less learning on my own. But when it comes to real-world examples, such as strings that can be validated with regular expressions, I'm kind of stuck.

Exercise: Untangle the following regular expression that validates an email address:

  [a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?

It looks like a jumble of characters.

Can someone please explain to me how this works?

I tried to use the online resources by Jan Goyvaerts. I would appreciate any help.

First of all, there is a good thread about exactly the same thing: Using a regular expression to validate an email address

Below is an explanation of your regular expression:

[a-z0-9!#$%&'*+/=?^_`{|}~-]+

- The square brackets denote a character class, which matches any one of the characters listed inside them. The plus sign ('+') is a quantifier meaning that the run of characters from this class must be at least one character long.

Also, the '+' is greedy, so this part of the pattern matches the longest possible run of such characters.

As for the contents of the square brackets, 'a-z' means any character in the range from a to z, which could be written mathematically as [a, z], and '0-9' works the same way. All the other characters are plain literals here, including the final '-', which is a literal hyphen because it is the last character in the class.
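To see the character class and the greedy '+' in action, here is a minimal sketch using Python's standard `re` module. The sample string and the name `LOCAL_CHARS` are only illustrative assumptions, not part of the original pattern.

```python
import re

# First part of the pattern: one or more characters from the class.
LOCAL_CHARS = re.compile(r"[a-z0-9!#$%&'*+/=?^_`{|}~-]+")

# The greedy '+' consumes as many class characters as it can and stops
# at the first character outside the class (here, the dot).
m = LOCAL_CHARS.match("john_doe+tag.more@example.com")
print(m.group())  # john_doe+tag

# '+' requires at least one character, so a leading dot gives no match.
print(LOCAL_CHARS.match(".abc"))  # None
```

The dot is not in the class, which is why the match ends before it; dots are handled by the next part of the pattern.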

(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*

- In regular expressions, parentheses denote grouping, and the asterisk ('*') is a greedy quantifier meaning "occurs zero or more times". So the contents of the parentheses may or may not appear in the string being matched.

Inside the parentheses, the ?: combination placed right after the opening parenthesis means that the group should not be captured as a sub-string for later reference.

Next, \. means a literal dot (see Escape sequence), since an unescaped dot is a metacharacter in regex.

After the dot we see the same character class again, explained above.
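The difference between a capturing and a non-capturing group, and between an escaped and an unescaped dot, can be sketched in Python as follows. The variable names and the sample strings are assumptions made for illustration.

```python
import re

CLASS = r"[a-z0-9!#$%&'*+/=?^_`{|}~-]+"

# Same structure twice: once with (?:...) and once with plain (...).
non_capturing = re.compile(CLASS + r"(?:\." + CLASS + r")*")
capturing = re.compile(CLASS + r"(\." + CLASS + r")*")

text = "first.middle.last"
print(non_capturing.fullmatch(text).groups())  # ()
print(capturing.fullmatch(text).groups())      # ('.last',)

# '*' allows zero repetitions, so a name without dots also matches.
print(bool(non_capturing.fullmatch("first")))  # True

# An unescaped dot matches any character; an escaped dot only a dot.
print(bool(re.fullmatch(r"a.b", "axb")))   # True
print(bool(re.fullmatch(r"a\.b", "axb")))  # False
```

Both versions accept the same strings; the non-capturing form simply does not store the group contents.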

@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+

- Here we see the at sign ('@'), which is just a literal character. It is followed by a non-capturing group that must occur one or more times (because of the + after it). The group consists of a single character from the [a-z0-9] class, then another non-capturing group, and finally an escaped dot. You can work out the inner group from the explanations above, except for the question mark ('?'), which, when used as a quantifier as it is here, means "either once or not at all".

[a-z0-9](?:[a-z0-9-]*[a-z0-9])?

- This last part is the same as the contents of the group explained above, minus the trailing dot, so I believe you now have enough information to understand it.
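As a rough check of the domain part described above, the following Python sketch builds it from a single label pattern and tries a few made-up host names. `LABEL`, `DOMAIN`, and the candidates are illustrative assumptions.

```python
import re

# One domain label: starts and ends with a letter or digit,
# with letters, digits, or hyphens allowed in between.
LABEL = r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"

# One or more "label." groups followed by a final label.
DOMAIN = re.compile(r"(?:" + LABEL + r"\.)+" + LABEL)

candidates = ["example.com", "mail.my-host.org", "a.b",
              "localhost", "bad-.com", "-bad.com"]
for candidate in candidates:
    print(candidate, bool(DOMAIN.fullmatch(candidate)))
# example.com True
# mail.my-host.org True
# a.b True
# localhost False   (no dot, so the "label." group never matches)
# bad-.com False    (a label may not end with a hyphen)
# -bad.com False    (a label may not start with a hyphen)
```

The optional inner group is what allows a single-character label such as "a" while still forbidding a leading or trailing hyphen.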

More on quantifier types here: Greedy vs. Reluctant vs. Possessive Quantifiers.

A good Regular Expressions reference: Regular Expression Language - Quick Reference

Some information on capturing in Regular Expressions: Regex Tutorial - Parentheses for Grouping and Capturing

About special characters: Regex Tutorial - Literal Characters and Special Characters

Regex statements can be fun yet tricky to follow. There are 5 parts to this statement.

One or more valid characters for a username

[a-z0-9!#$%&'*+/=?^_`{|}~-]+

Check for a single '.' followed by one or more additional valid characters (this group may repeat any number of times, or not appear at all)

(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*

The '@' symbol

One or more valid second / lower level domains, each followed by a dot

(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+

A valid top level domain

[a-z0-9](?:[a-z0-9-]*[a-z0-9])?
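The five parts above can be glued back together to confirm that they really add up to the expression from the question. This is a minimal Python sketch; the names `LOCAL_WORD`, `LABEL`, `EMAIL`, and `ORIGINAL` are assumptions introduced here for readability.

```python
import re

LOCAL_WORD = r"[a-z0-9!#$%&'*+/=?^_`{|}~-]+"
LABEL = r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"

EMAIL = re.compile(
    LOCAL_WORD                        # 1. username characters
    + r"(?:\." + LOCAL_WORD + r")*"   # 2. optional ".more" groups
    + r"@"                            # 3. the '@' symbol
    + r"(?:" + LABEL + r"\.)+"        # 4. lower-level domains, each with a dot
    + LABEL                           # 5. top-level domain
)

# The expression exactly as given in the question.
ORIGINAL = (
    r"[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*"
    r"@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"
)

# The assembled pattern is character-for-character the same string.
assert EMAIL.pattern == ORIGINAL
print(bool(EMAIL.fullmatch("first.last@mail.example.com")))  # True
```

Building the pattern from named pieces is just one way to keep it readable; the matching behaviour is identical to the one-line version.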

I recommend http://www.ultrapico.com/expresso.htm. It will break the statement down for you.

I've found a remarkable tool for visualizing regular expressions here: http://regexper.com

It shows me that your regular expression breaks down like this. Hopefully this helps explain it.

  1. [a-z0-9!#$%&'*+/=?^_`{|}~-]+
    This looks for at least one of the characters given here (a-z, 0-9, and those special characters).
  2. (?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*
    This looks for the same as above, but only when it follows a dot. This part is optional and can be repeated indefinitely. It prevents a dot at the end of the name.
  3. @
    Matches the @ symbol.
  4. (?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+
    This matches a label made of a-z and 0-9, with optional - characters in the middle, followed by a dot. This has to be matched at least once.
  5. [a-z0-9](?:[a-z0-9-]*[a-z0-9])?
    This looks for a-z or 0-9, optionally followed by a-z, 0-9, or -, but it can't end with a -.
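The behaviour described in the list can be tried out with a short Python sketch. The helper `is_valid`, its `ignore_case` parameter, and the sample addresses are hypothetical and only serve as an illustration; the sketch assumes the whole string must match, hence `fullmatch`.

```python
import re

EMAIL = re.compile(
    r"[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*"
    r"@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"
)

def is_valid(address: str, ignore_case: bool = False) -> bool:
    """Hypothetical helper: True if the whole string matches the pattern."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.fullmatch(EMAIL.pattern, address, flags) is not None

samples = [
    "user@example.com",            # True
    "first.last@sub.example.org",  # True
    "user.@example.com",           # False: dot at the end of the name
    "user..name@example.com",      # False: two dots in a row
    "user@example",                # False: part 4 needs at least one dot
    "user@exam-ple-.com",          # False: label ends with a hyphen
    "User@Example.com",            # False: the classes list only a-z
]
for s in samples:
    print(f"{s!r:32} {is_valid(s)}")

# Upper-case letters are accepted only with a case-insensitive flag.
print(is_valid("User@Example.com", ignore_case=True))  # True
```

Note that the pattern as written contains only lower-case ranges, so mixed-case input needs either a case-insensitive flag or lower-casing before the check.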

See this answer. The problem is probably too difficult to solve completely. You have two problems here: 1. Regular expressions are not easy. 2. Escaping special characters is messy. Finally, email addresses are complicated. I recommend studying this post if you are really interested.

I have two suggestions for you.

  1. Escaping special characters is messy.
  2. Email addresses are complicated. I recommend studying this post if you are really interested.

Please check out these other posts: Validation in Regex and Regex Help.
