简体   繁体   中英

Python Regex not returning phone numbers

Given the following code:

import re

file_object = open("all-OANC.txt", "r")
file_text = file_object.read()

pattern = "(\+?1-)?(\()?[0-9]{3}(\))?(-|.)[0-9]{3}(-|.)[0-9]{4}"

for match in re.findall(pattern, file_text):
    print match

I get output that stretches like this:

('', '', '', '-', '-')
('', '', '', '-', '-')
('', '', '', '-', '-')
('', '', '', '-', '-')
('', '', '', '-', '-')
('', '', '', '-', '-')
('', '', '', '-', '-')
('', '', '', '-', '-')
('', '', '', '-', '-')

I'm trying to find phone numbers, and I am one hundred percent sure there are numbers in the file. When I search for numbers in an online applet for example, with the same expression, I get matches.

Here is a snippet where the expression is found outside of python:

"Slate on Paper," our specially formatted print-out version of Slate, is e-mailed to readers Friday around midday. It also can be downloaded from our site. Those services are free. An actual paper edition of "Slate on Paper" can be mailed to you (call 800-555-4995), but that costs money and can take a few days to arrive."

I want output that at least recognizes the presence of a number

It's your capture groups that are being displayed. Display the whole match:

text = '''"Slate on Paper," our specially formatted print-out version of Slate, is e-mailed to readers Friday around midday. It also can be downloaded from our site. Those services are free. An actual paper edition of "Slate on Paper" can be mailed to you (call 800-555-4995), but that costs money and can take a few days to arrive."'''

pattern = "(\+?1-)?(\()?[0-9]{3}(\))?(-|.)[0-9]{3}(-|.)[0-9]{4}"    

for match in re.finditer(pattern,text):
    print(match.group())

Output:

800-555-4995

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM