
Normalize phone numbers using regex

I have a list of phone numbers that users entered without any validation. They look like this:

 - 495) 995-0595
 - 105-6439
 - 095 268 8621
 - 324-51-44
 - 7 (495) 995-05-95
 - 7 495 995 05 95
 - 7 (495) 995-0595
 - +7 (495) 995-05-95
 - 7 (495)925-34-89
 - 7(495)9253489
 - 7(495)925-34-89
 - 74959950595

I want to convert these numbers into this (Russian) format: +X (XXX) XXX-XX-XX

Is there any way to achieve this using regular expressions?

Yup, extract and reformat!

string prefix = "495"; // area code used for 7-digit local numbers

List<string> oldlist = new List<string>(); // raw user input goes here
List<string> newlist = new List<string>();
foreach (string raw in oldlist)
{
    // The foreach variable is read-only, so clean into a local copy.
    string s = raw.Replace("(", "").Replace(")", ""); // etc. for the other separators
    newlist.Add(numFormat(s));
}

string numFormat(string s)
{
    string my = s; // fall back to the input for lengths not handled yet
    if (s.Length == 7)
    {
        my = string.Format("+7 ({0}) {1}-{2}-{3}", prefix, s.Substring(0, 3), s.Substring(3, 2), s.Substring(5, 2));
    }
    else if (s.Length == 10)
    {
        my = string.Format("+7 ({0}) {1}-{2}-{3}", s.Substring(0, 3), s.Substring(3, 3), s.Substring(6, 2), s.Substring(8, 2));
    }
    // etc.
    return my;
}

This is just off the top of my head, but you get the idea.

Run your list through this:

// Strip everything that is not a digit; 7-digit local numbers get the "7499" prefix.
var strippedNumbers = new List<string>();
foreach (var num in listOfRussianNumbers.Select(x => Regex.Replace(x, "[^0-9]", "")))
    strippedNumbers.Add(num.Length == 7 ? "7499" + num : num);

Then use string.Format to print each number the way you want:

// num must hold exactly 11 digits here, otherwise Substring throws.
string.Format("+{0} ({1}) {2}-{3}-{4}",
    num.Substring(0, 1),
    num.Substring(1, 3),
    num.Substring(4, 3),
    num.Substring(7, 2),
    num.Substring(9, 2));
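The two steps above can be combined into one helper. This is only a sketch under stated assumptions: the class name PhoneNormalizer, the method Normalize and the constant DefaultAreaCode are hypothetical, 7-digit numbers are assumed to be local numbers in area code 495 (the answer above uses 499 instead), and 10-digit numbers are assumed to be missing only the country code 7.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PhoneNormalizer
{
    // Assumption: 7-digit numbers are local and belong to this area code.
    private const string DefaultAreaCode = "495";

    // Returns "+X (XXX) XXX-XX-XX", or null when the digit count is unexpected.
    public static string Normalize(string input)
    {
        // Keep digits only.
        string digits = Regex.Replace(input, "[^0-9]", "");

        if (digits.Length == 7)
            digits = "7" + DefaultAreaCode + digits; // local number
        else if (digits.Length == 10)
            digits = "7" + digits;                   // country code missing

        if (digits.Length != 11)
            return null; // leave for manual processing

        return string.Format("+{0} ({1}) {2}-{3}-{4}",
            digits.Substring(0, 1),
            digits.Substring(1, 3),
            digits.Substring(4, 3),
            digits.Substring(7, 2),
            digits.Substring(9, 2));
    }
}
```

Returning null for anything that does not end up with 11 digits keeps odd inputs out of the formatted list so they can be reviewed by hand.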

I think you should do this:

  1. Convert it to a string.
  2. Loop over it and remove anything that is not a digit. You can use Char.IsDigit() for this.
  3. Then do your desired formatting using string.Substring().

Make sure you do all of these steps on strings only.

For example:

string str = "495) 995-0595";
List<char> digits = new List<char>();

for (int i = 0; i < str.Length; i++)
{
    if(char.IsDigit(str[i]))
        digits.Add(str[i]);
}

str = new string(digits.ToArray());

str = "+" + str.Substring(0, 1) + " (" + str.Substring(1, 3) + ") " 
      + str.Substring(4, 2) + "-" + str.Substring(6, 2) + "-" + str.Substring(8);

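This returned "+4 (959) 95-05-95". Note that this sample has no country code, so every digit lands one position too early; prepend the country code to 10-digit numbers and take a 3-digit group after the area code to get the target layout.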

This is the best I can do on short notice.

((\+?\d)\s?)?\(?(\d\d\d)\)?\s?(\d\d\d)(\s|-)?(\d\d)(\s|-)?(\d\d)

This matches the samples below in full, except the two 7-digit numbers, which it does not match at all (the bold marking from the original post is lost here).

495) 995-0595
105-6439 (no match)
095 268 8621
324-51-44 (no match)
7 (495) 995-05-95
7 495 995 05 95
7 (495) 995-0595
+7 (495) 995-05-95
7 (495)925-34-89
7(495)9253489
7(495)925-34-89
74959950595

Strings that don't match can be sent through a different routine or set aside for manual processing.
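As a rough sketch of how the capture groups of that pattern could drive the reformatting: the names PhoneRegexFormatter and TryFormat are hypothetical, the anchors and surrounding whitespace handling are additions to the original pattern, and a missing country digit is assumed to mean 7.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PhoneRegexFormatter
{
    // The pattern from the answer, anchored (an addition) so partial matches are rejected.
    private static readonly Regex Pattern = new Regex(
        @"^\s*((\+?\d)\s?)?\(?(\d\d\d)\)?\s?(\d\d\d)(\s|-)?(\d\d)(\s|-)?(\d\d)\s*$");

    // Returns false for inputs the pattern does not cover, e.g. 7-digit numbers.
    public static bool TryFormat(string input, out string formatted)
    {
        formatted = null;
        Match m = Pattern.Match(input);
        if (!m.Success)
            return false;

        // Group 2 is the optional country digit (possibly with '+'); assume 7 when absent.
        string country = m.Groups[2].Success ? m.Groups[2].Value.TrimStart('+') : "7";

        // Groups 3, 4, 6 and 8 hold the area code and the 3-2-2 digit groups.
        formatted = string.Format("+{0} ({1}) {2}-{3}-{4}",
            country,
            m.Groups[3].Value,
            m.Groups[4].Value,
            m.Groups[6].Value,
            m.Groups[8].Value);
        return true;
    }
}
```

Inputs for which TryFormat returns false, such as 105-6439, are the ones to hand to the separate routine or to manual processing.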

Something like this:

(\d)? ?\(?(\d\d\d)?\)? *?(\d\d\d) *?-? *?(\d\d) *?-? *?(\d\d)
