
Regular Expression - Match Email Address with Exceptions

Please read the question carefully: it's not about validating email addresses!

I'm trying to construct a regular expression (currently in C#) that extracts all email addresses from a text, with two specific exceptions.

I have:

  • user1@company.com
  • user2@company.com
  • user3@company.com
  • user1@private.com
  • user2@private.com

all in the same text file, on the same line, delimited by whitespace.

At first I tried to match all of these email addresses except the ones starting with "user1". I used:

[\S]*(?<!user1)@[\S]*\..[a-zA-Z.]{1,}

which works well. Now I have another requirement that says: also do not match if the complete email address is "user2@private.com". It should still match "user2@company.com", so I can't use:

[\S]*(?<!(user1|user2))@[\S]*\..[a-zA-Z.]{1,}

Therefore I tried an additional negative lookbehind:

([\S]*(?<!user1)@[\S]*\..[a-zA-Z.]{1,})(?<!user2@private\.com)

which doesn't work, because I guess it is satisfied with matching "user2@private.co". Is there any way to achieve what I'm trying to do? My head already hurts...

I would use additional code, but I'm using third-party software that only gives me the option of a regular expression, and only a single regular expression, so that's all I've got.
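The guess in the question can be checked with a small C# sketch. It runs the question's last pattern over the five sample addresses; the class and method names (`LookbehindDemo`, `Extract`) are made up for illustration. Once the trailing lookbehind rejects the full address, the engine backtracks by one character and reports a shortened match.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LookbehindDemo
{
    // The pattern from the question, including the trailing negative lookbehind.
    public const string Pattern =
        @"([\S]*(?<!user1)@[\S]*\..[a-zA-Z.]{1,})(?<!user2@private\.com)";

    // Returns every match of Pattern in the given text.
    public static string[] Extract(string text) =>
        Regex.Matches(text, Pattern).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        var text = "user1@company.com user2@company.com user3@company.com " +
                   "user1@private.com user2@private.com";
        foreach (var hit in Extract(text))
            Console.WriteLine(hit);
        // The last address is not rejected: the quantifier gives back the final "m",
        // the lookbehind then succeeds, and "user2@private.co" is reported.
    }
}
```

The lookbehind only inspects the text immediately before the current position, so it cannot stop the quantifier in front of it from ending the match one character earlier.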

A single-regex solution, though not a pretty one, is

(?<!\S)(?!user1@|user2@private\.com(?!\S))\S+@\S+\.[a-zA-Z]{2,}(?!\S)

See the regex demo.

Details:

  • (?<!\S) - a position not preceded with a non-whitespace char (that is, the start of the text or a position right after whitespace)
  • (?!user1@|user2@private\.com(?!\S)) - that position cannot be followed with user1@, nor with the complete address user2@private.com (one that is not followed with a non-whitespace char)
  • \S+ - 1+ non-whitespace chars
  • @ - a literal @
  • \S+ - 1+ non-whitespace chars
  • \. - a dot
  • [a-zA-Z]{2,}(?!\S) - 2 or more ASCII letters not followed with a non-whitespace char.
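As a rough illustration of the pattern described above, the following C# sketch applies it with `Regex.Matches` to a sample line. The class and method names (`SingleRegexDemo`, `Extract`) are assumptions for the example and are not part of the original answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SingleRegexDemo
{
    // Whitespace-delimited tokens that look like an address, excluding anything
    // starting with "user1@" and the exact address "user2@private.com".
    public const string Pattern =
        @"(?<!\S)(?!user1@|user2@private\.com(?!\S))\S+@\S+\.[a-zA-Z]{2,}(?!\S)";

    // Returns every match of Pattern in the given text.
    public static string[] Extract(string text) =>
        Regex.Matches(text, Pattern).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        var text = "user1@company.com user2@company.com user3@company.com " +
                   "user1@private.com user2@private.com";
        foreach (var address in Extract(text))
            Console.WriteLine(address);
        // prints user2@company.com and user3@company.com, one per line
    }
}
```

Because both ends of the match are pinned to whitespace boundaries, the engine cannot fall back to a shortened match such as "user2@private.co".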

A more readable approach is to split on whitespace, keep the items matching @"^\S+@\S+\.\S+$", and use a bit of code to filter out the unwanted matches:

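using System;
using System.Linq;
using System.Text.RegularExpressions;

var s = @"Text user1@company.com here user2@company.com and user3@company.com here user1@private.com more user2@private.com";
// Split on whitespace, keep tokens shaped like an email address, then drop the two exceptions.
var result = s.Split().Where(m => 
        Regex.IsMatch(m, @"^\S+@\S+\.\S+$") && m != "user2@private.com" && !m.StartsWith("user1@"));
foreach (var str in result)
    Console.WriteLine(str);
// prints user2@company.com and user3@company.com, one per line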

See the C# demo.

You should be able to use a negative lookahead instead. The following solution should work if you have explicit emails you need to filter out. But keep in mind that it doesn't scale well: you would not want to list thousands of emails here.

^(?!user1|user2(?!@company.com))[\S]*@[\S]*\..[a-zA-Z.]{1,}

If you suspect that many more of these rules could be added in the future, you might need to think about a better approach. If the emails to be filtered out are explicit addresses (not patterns), you could maintain a blacklist somewhere and filter them out after you have extracted/validated the email address patterns.
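A minimal sketch of that blacklist idea in C#, assuming the blocked addresses are held in an in-memory set. The names `BlacklistDemo`, `Blocked` and `Extract`, the extraction pattern, and the case-insensitive comparison are all assumptions made for illustration; in practice the list might be loaded from a file or a database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class BlacklistDemo
{
    // Hypothetical blacklist of explicit addresses (compared case-insensitively here).
    static readonly HashSet<string> Blocked =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "user1@company.com",
            "user1@private.com",
            "user2@private.com",
        };

    // Step 1: extract every whitespace-delimited token shaped like an address.
    // Step 2: drop the ones found in the blacklist.
    public static List<string> Extract(string text) =>
        Regex.Matches(text, @"(?<!\S)\S+@\S+\.[a-zA-Z]{2,}(?!\S)")
             .Cast<Match>()
             .Select(m => m.Value)
             .Where(address => !Blocked.Contains(address))
             .ToList();

    public static void Main()
    {
        var text = "user1@company.com user2@company.com user3@company.com " +
                   "user1@private.com user2@private.com";
        foreach (var address in Extract(text))
            Console.WriteLine(address);
        // prints user2@company.com and user3@company.com, one per line
    }
}
```

Keeping the exceptions in a set keeps the extraction pattern unchanged as the list grows, at the cost of needing code outside the regex.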
