
grep regex match email address

I have a file test.txt which contains the following content:

BC@ABSC.CA
ABCabc+-._@mcmaster.io.ca
ABCabc+-._@school.image
ABCabc+-._@school3-computer.image
ABCabc+-._@school3-IT.image.tor.chrome.ca
ABCabc+-._@school3-IT.image.tor.chrome.canadannn
ABC123abc+-._@school3-IT.imageal.tor.chrome.canadannn
ABCabc+-._@school3-*IT.image.tor.chrome.ca
ABCabc+-._@school3-IT.image.tor.chrome.caskdlfj
ABCab*c+-._@school3-IT.image.tor.chrome.caABCabc

I then use

grep -E '^[A-Za-z0-9+._-]+@([a-zA-Z0-9-]+\.)+[a-zA-Z]{2,6}' test.txt

to try to match valid email addresses. The key here is that the last part of the domain (everything after the final dot) has to be a sequence of 2 to 6 letters.

So I am hoping to get the following output:

BC@ABSC.CA
ABCabc+-._@mcmaster.io.ca
ABCabc+-._@school.image
ABCabc+-._@school3-computer.image
ABCabc+-._@school3-IT.image.tor.chrome.ca

But I also get the following lines, even though the length of the last domain part exceeds 6 characters.

ABCabc+-._@school3-IT.image.tor.chrome.canadannn
ABC123abc+-._@school3-IT.imageal.tor.chrome.canadannn
ABCabc+-._@school3-IT.image.tor.chrome.caskdlfj

How do I solve this problem?

The problem is that grep selects a line as soon as the pattern matches any part of it. Your pattern is anchored at the start with ^ but not at the end, so anything after the match is simply ignored. If you want the pattern to cover the whole line, add the $ anchor at the end. Let's look at an example:

ABCabc+-._@school3-IT.image.tor.chrome.canadannn
  1. ABCabc+-._ matches ^[A-Za-z0-9+._-]+
  2. @ matches @
  3. school3-IT.image.tor.chrome. matches ([a-zA-Z0-9-]+\.)+ As far as I know, all quantifiers are greedy in grep.
  4. canada matches [a-zA-Z]{2,6}
  5. nnn gets ignored

Without the $, only some part of the line has to match, not necessarily the whole thing.
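To see this for yourself, you can ask grep to print only the matched text instead of the whole line. This is a minimal sketch, assuming a grep that supports the -o option (GNU and BSD grep do, but POSIX does not require it); the variable names are just for illustration.

```shell
# One of the unwanted lines from test.txt in the question.
line='ABCabc+-._@school3-IT.image.tor.chrome.canadannn'
# The original pattern, without a trailing $ anchor.
pattern='^[A-Za-z0-9+._-]+@([a-zA-Z0-9-]+\.)+[a-zA-Z]{2,6}'

# -o prints only the part of the line that the pattern matched.
matched=$(printf '%s\n' "$line" | grep -oE "$pattern")
printf 'matched: %s\n' "$matched"
# prints: matched: ABCabc+-._@school3-IT.image.tor.chrome.canada
```

The trailing nnn is not part of the match, but the line is still selected because a match exists somewhere in it.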

Add an end-of-line anchor, $, to your regex:

grep -E '^[A-Za-z0-9+._-]+@([a-zA-Z0-9-]+\.)+[a-zA-Z]{2,6}$' test.txt
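If you want to check the difference quickly, the following sketch recreates the sample file from the question in a temporary directory and counts the matching lines with and without the anchor. The temporary directory and the variable names are assumptions made for the example.

```shell
# Recreate the sample file from the question in a temporary directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/test.txt" <<'EOF'
BC@ABSC.CA
ABCabc+-._@mcmaster.io.ca
ABCabc+-._@school.image
ABCabc+-._@school3-computer.image
ABCabc+-._@school3-IT.image.tor.chrome.ca
ABCabc+-._@school3-IT.image.tor.chrome.canadannn
ABC123abc+-._@school3-IT.imageal.tor.chrome.canadannn
ABCabc+-._@school3-*IT.image.tor.chrome.ca
ABCabc+-._@school3-IT.image.tor.chrome.caskdlfj
ABCab*c+-._@school3-IT.image.tor.chrome.caABCabc
EOF

unanchored='^[A-Za-z0-9+._-]+@([a-zA-Z0-9-]+\.)+[a-zA-Z]{2,6}'
anchored="${unanchored}\$"   # same pattern with a trailing $ anchor

# -c prints the number of matching lines instead of the lines themselves.
before=$(grep -cE "$unanchored" "$tmpdir/test.txt")
after=$(grep -cE "$anchored" "$tmpdir/test.txt")
echo "without \$: $before lines, with \$: $after lines"
# prints: without $: 8 lines, with $: 5 lines

rm -r "$tmpdir"
```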

More about it: http://www.regular-expressions.info/anchors.html

You can fix your query by adding a $ at the end of your pattern.

grep -E '^[A-Za-z0-9+._-]+@([a-zA-Z0-9-]+\.)+[a-zA-Z]{2,6}$' test.txt

Here is a live demo: https://regex101.com/r/NtZJQ0/1
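As a possible alternative to writing the anchors by hand, grep also has the -x option, which only selects lines where the pattern matches the entire line. The sketch below drops ^ and $ and relies on -x instead; the two input lines are taken from test.txt, and the variable name is only for illustration.

```shell
# -x requires the pattern to match the whole line, so ^ and $ are not needed.
result=$(printf '%s\n' \
  'BC@ABSC.CA' \
  'ABCabc+-._@school3-IT.image.tor.chrome.caskdlfj' |
  grep -xE '[A-Za-z0-9+._-]+@([a-zA-Z0-9-]+\.)+[a-zA-Z]{2,6}')
printf '%s\n' "$result"
# prints: BC@ABSC.CA
```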
