C# - Regex to validate all phone numbers?

I found the following regex, which is supposed to validate all possible phone numbers, and tested it on this regex validator:

^\s*(?:\+?(\d{1,3}))?([-. (]*(\d{3})[-. )]*)?((\d{3})[-. ]*(\d{2,4})(?:[-.x ]*(\d+))?)\s*$

Why, then, does it not match the following number when I use it in my code?

string text = "Herzeliya, Israel Tel: 972-52-2650599 Born 17/1/1975,";

List<string> Phones = new List<string>();

Regex phon1Regex = new Regex(@"^\s*(?:\+?(\d{1,3}))?([-. (]*(\d{3})[-. )]*)?((\d{3})[-. ]*(\d{2,4})(?:[-.x ]*(\d+))?)\s*$"); 
MatchCollection phon1Matches = phon1Regex.Matches(text);

foreach (Match phon1Match in phon1Matches) 
    Phones.Add(phon1Match.Value);

The list Phones remains empty.

What am I missing here?

You do not just want to check whether a phone number's string representation appears valid; you want to find it inside a much larger string. Those two operations are entirely different and should be solved separately. The pattern is anchored with ^ and $, so it only matches when the entire input is a phone number, which is why Matches finds nothing in your text. There cannot be a perfect "one size fits all" regex solution. If there were, cultures would have failed at being uselessly distinct from one another, and they really do not like that ;)
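To make the difference between validating and finding concrete, here is a minimal sketch. It reuses the pattern from the question, once with the ^ and $ anchors (validation of an isolated string) and once without them (search inside a longer text). The names PhoneRegexSketch, IsPhoneNumber and FindPhoneNumbers are made up for this illustration, and the unanchored pattern is loose enough that it can also pick up other digit runs, so treat it as a starting point rather than a finished solution.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PhoneRegexSketch
{
    // The pattern from the question, minus the ^\s* prefix and the \s*$ suffix.
    private const string Core =
        @"(?:\+?(\d{1,3}))?([-. (]*(\d{3})[-. )]*)?((\d{3})[-. ]*(\d{2,4})(?:[-.x ]*(\d+))?)";

    // Validation: anchored, so the whole input must be a phone number.
    private static readonly Regex Validator = new Regex(@"^\s*" + Core + @"\s*$");

    // Search: no anchors, so the pattern may match anywhere in the input.
    private static readonly Regex Finder = new Regex(Core);

    public static bool IsPhoneNumber(string candidate)
    {
        return Validator.IsMatch(candidate);
    }

    public static List<string> FindPhoneNumbers(string text)
    {
        var phones = new List<string>();
        foreach (Match match in Finder.Matches(text))
        {
            // Trim in case a separator or space ends up at the edge of a match.
            phones.Add(match.Value.Trim());
        }
        return phones;
    }
}
```

With the text from the question, IsPhoneNumber(text) is false because the string does not consist of a phone number alone, while FindPhoneNumbers(text) returns a list containing 972-52-2650599.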

Ideally you should not have all this data in a single string. A free-form string is the second-hardest format to automate (only raw binary is worse). Parsing it will be a pain. At the very least, the string should have proper comma separation between segments or key/value pairs. If you can modify the source to be more automation-friendly, do that first. Even some XML output or proper CSV would be a huge step up.

Phone number recognition is like any other number recognition: the format is not fixed and varies by culture as much as DateTime and other numbers do.

Step 1 should be to split this large string into discrete string segments for:

  • Place (Herzeliya, Israel); maybe City and Country as separate fields
  • Telephone Number (972-52-2650599)
  • Date of Birth (17/1/1975)

Then you can think about parsing each of those strings, including the Telephone Number.
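The splitting step could be sketched roughly as follows. This assumes every line follows the layout of the single sample, with a literal "Tel:" before the number and "Born" before the date, and that the date is day/month/year; none of that is confirmed beyond the one example. ContactLineSketch and TrySplit are hypothetical names used only for this illustration.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class ContactLineSketch
{
    // Assumed layout: "<place> Tel: <phone> Born <date>[,]".
    // The "Tel:" and "Born" markers come from the one sample line only.
    private static readonly Regex Layout = new Regex(
        @"^(?<place>.*?)\s*Tel:\s*(?<phone>.*?)\s*Born\s*(?<date>[^,]*?)\s*,?\s*$");

    // Returns true only if the line fits the layout AND the date parses.
    public static bool TrySplit(string line, out string place, out string phone, out DateTime born)
    {
        place = string.Empty;
        phone = string.Empty;
        born = default(DateTime);

        Match match = Layout.Match(line);
        if (!match.Success)
        {
            return false;
        }

        place = match.Groups["place"].Value;
        phone = match.Groups["phone"].Value;

        // Assumption: day/month/year without leading zeros, as in 17/1/1975.
        return DateTime.TryParseExact(
            match.Groups["date"].Value,
            "d/M/yyyy",
            CultureInfo.InvariantCulture,
            DateTimeStyles.None,
            out born);
    }
}
```

Once the phone segment is isolated this way, it can be checked on its own with an anchored pattern such as the one in the question, and the place could be split further on the comma if City and Country are needed separately.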
