
Python regex to extract phone numbers from string

I am very new to regex. Using Python's re module, I want to extract the phone numbers from the following multi-line string:

 Source = """<p><strong>Kuala Lumpur</strong><strong>:</strong> +60 (0)3 2723 7900</p>
        <p><strong>Mutiara Damansara:</strong> +60 (0)3 2723 7900</p>
        <p><strong>Penang:</strong> + 60 (0)4 255 9000</p>
        <h2>Where we are </h2>
        <strong>&nbsp;Call us on:</strong>&nbsp;+6 (03) 8924 8686
        </p></div><div class="sys_two">
    <h3 class="parentSchool">General enquiries</h3><p style="FONT-SIZE: 11px">
     <strong>&nbsp;Call us on:</strong>&nbsp;+6 (03) 8924 8000
+ 60 (7) 268-6200 <br />
 Fax:<br /> 
 +60 (7) 228-6202<br /> 
Phone:</strong><strong style="color: #f00">+601-4228-8055</strong>"""

So once I have the right pattern, I should be able to find the numbers using

phone = re.findall(pattern, Source, re.DOTALL)

 ['+60 (0)3 2723 7900',
  '+60 (0)3 2723 7900',
  '+ 60 (0)4 255 9000',
  '+6 (03) 8924 8686',
  '+6 (03) 8924 8000',
  '+ 60 (7) 268-6200',
  '+60 (7) 228-6202',
  '+601-4228-8055']

Please help me identify the right pattern.

This should find all the phone numbers in a given string:

re.findall(r'[\+\(]?[1-9][0-9 .\-\(\)]{8,}[0-9]', Source)

 >>> re.findall(r'[\+\(]?[1-9][0-9 .\-\(\)]{8,}[0-9]', Source)
 ['+60 (0)3 2723 7900', '+60 (0)3 2723 7900', '60 (0)4 255 9000', '+6 (03) 8924 8686', '+6 (03) 8924 8000', '60 (7) 268-6200', '+60 (7) 228-6202', '+601-4228-8055']

Basically, the regex lays out these rules:

  1. The matched string may start with a + or ( symbol.
  2. That must be followed by a digit between 1 and 9.
  3. It must end with a digit between 0 and 9.
  4. In between, it must contain at least 8 characters drawn from 0-9, space, ., -, ( and ).
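The four rules can be spelled out with re.VERBOSE. This is only a sketch of the same pattern; the names PHONE_RE and sample are made up for illustration. Because of rule 2, a + that is followed by a space is left out of the match, which is why '+ 60 ...' comes back as '60 ...'.

```python
import re

# Same pattern as above, written with re.VERBOSE so each rule is visible.
# PHONE_RE and sample are illustrative names, not part of the original answer.
PHONE_RE = re.compile(r"""
    [+(]?            # rule 1: optional leading + or (
    [1-9]            # rule 2: first digit is 1-9
    [0-9 .\-()]{8,}  # rule 4: at least 8 digits, spaces, dots, hyphens or parentheses
    [0-9]            # rule 3: last character is a digit
""", re.VERBOSE)

sample = "Office: +60 (0)3 2723 7900, Penang: + 60 (0)4 255 9000"
matches = PHONE_RE.findall(sample)
print(matches)
```

Whitespace inside a character class is kept even in verbose mode, so the space in rule 4 still counts as an allowed character.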

Using the re module:

>>> import re
>>> Source = """<p><strong>Kuala Lumpur</strong><strong>:</strong> +60 (0)3 2723 7900</p>
        <p><strong>Mutiara Damansara:</strong> +60 (0)3 2723 7900</p>
        <p><strong>Penang:</strong> + 60 (0)4 255 9000</p>
        <h2>Where we are </h2>
        <strong>&nbsp;Call us on:</strong>&nbsp;+6 (03) 8924 8686
        </p></div><div class="sys_two">
    <h3 class="parentSchool">General enquiries</h3><p style="FONT-SIZE: 11px">
     <strong>&nbsp;Call us on:</strong>&nbsp;+6 (03) 8924 8000
+ 60 (7) 268-6200 <br />
 Fax:<br /> 
 +60 (7) 228-6202<br /> 
Phone:</strong><strong style="color: #f00">+601-4228-8055</strong>"""

>>> for i in re.findall(r'\+[-()\s\d]+?(?=\s*[+<])', Source):
...     print(i)


+60 (0)3 2723 7900
+60 (0)3 2723 7900
+ 60 (0)4 255 9000
+6 (03) 8924 8686
+6 (03) 8924 8000
+ 60 (7) 268-6200
+60 (7) 228-6202
+601-4228-8055
>>> 
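For readers unfamiliar with lazy quantifiers and lookaheads, here is the same pattern split into commented pieces, as a sketch for Python 3. The names PLUS_NUMBER_RE and html, and the shortened sample string, are illustrative only. Note that this approach assumes every number starts with + and is followed by a tag or by another +.

```python
import re

# The pattern from the session above, annotated piece by piece.
# PLUS_NUMBER_RE and html are hypothetical names used only for this sketch.
PLUS_NUMBER_RE = re.compile(r"""
    \+              # every number starts with a literal plus sign
    [-()\s\d]+?     # digits, whitespace, hyphens, parentheses (lazy: as few as possible)
    (?=\s*[+<])     # stop just before optional whitespace followed by + or <
""", re.VERBOSE)

html = "<p>+6 (03) 8924 8000\n+ 60 (7) 268-6200 <br />+601-4228-8055</strong>"
numbers = PLUS_NUMBER_RE.findall(html)
for number in numbers:
    print(number)
```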

I extract a mobile number from a string using the regular expression below.

import re

sent = "this is my mobile number 9999922118"
# Match a standalone 10-digit number that starts with 7, 8 or 9
phone = re.search(r'\b[789]\d{9}\b', sent)
if phone:
    print(phone.group(0))

pattern = r"(\+)?([0-9]{1,3})?( )?(\([0-9]{1,3}\))?( )?[(\d+((\-\d+)+)]{10,15}"

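import re

sent = "Tampa, FL 33602 PH: 813-202-7100 FAX: 813-221-8837 phone +60 (0)3 2723 7900"
# Raw string so the backslashes reach the regex engine unchanged
pattern = r"(\+)?([0-9]{1,3})?( )?(\([0-9]{1,3}\))?( )?[(\d+((\-\d+)+)]{10,15}"
# The pattern has capturing groups, so re.findall would return tuples of groups;
# re.finditer with group(0) collects the full matched text instead.
phone = [m.group(0) for m in re.finditer(pattern, sent)]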

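This finds the hyphen-separated phone numbers in the string (813-202-7100 and 813-221-8837). The space-separated number +60 (0)3 2723 7900 is not matched, because the final character class does not allow spaces.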
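One thing to watch with patterns like this: when a pattern contains capturing groups, re.findall returns the captured groups (tuples when there are several) rather than the full matched text. A minimal sketch of the difference, using a simplified hypothetical pattern rather than the one above:

```python
import re

text = "PH: 813-202-7100 FAX: 813-221-8837"

# Simplified illustrative pattern with one capturing group (the first three digits).
grouped = r"(\d{3})-\d{3}-\d{4}"

# With a capturing group, findall returns only the group contents.
groups_only = re.findall(grouped, text)

# finditer yields match objects; group(0) is the entire matched text.
full_matches = [m.group(0) for m in re.finditer(grouped, text)]

# Alternatively, a non-capturing group (?:...) lets findall return full matches.
non_capturing = re.findall(r"(?:\d{3})-\d{3}-\d{4}", text)

print(groups_only)
print(full_matches)
print(non_capturing)
```

Either re.finditer or non-capturing groups works; which one reads better depends on whether the individual groups are needed elsewhere.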
