
re.findall() returns extra data when using optionals in between Regex expressions

I am getting extra values that I do not want in the list returned by re.findall(). This is what I expected the following code to return:

[('999-999-9999'), ('999 999 9999'), ('999.999.9999')]

However, what I actually end up with is the following:

[('999-999-9999', '-', '-'), ('999 999 9999', ' ', ' '), ('999.999.9999', '.', '.')]

This is the code I have:

import re

teststr = '''
    Phone: 999-999-9999,
           999 999 9999,
           999.999.9999
'''
phoneRegex = re.compile(r'(\d{3}(-|\s|\.)\d{3}(-|\s|\.)\d{4})')

regexMatches = phoneRegex.findall(teststr)
print(regexMatches)

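Turn the inner capturing groups into non-capturing groups. When a pattern contains more than one capturing group, re.findall() returns a tuple of all the groups for each match, which is why the separators show up next to the full number.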

(?:-|\s|\.)

or

[-\s.]

Example:

>>> import re
>>> teststr = '''
    Phone: 999-999-9999,
           999 999 9999,
           999.999.9999
'''
>>> re.findall(r'\b(\d{3}[-.\s]\d{3}[.\s-]\d{4})\b', teststr)
['999-999-9999', '999 999 9999', '999.999.9999']
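To put the question's pattern and the two suggested fixes side by side, here is a minimal sketch. The variable names (capturing, non_capturing, char_class and so on) are made up for illustration and are not from the original post. The last line shows one more possible approach, assuming you want to keep the separator groups in the pattern: use finditer() and take group(0), which is always the whole match.

```python
import re

teststr = '''
    Phone: 999-999-9999,
           999 999 9999,
           999.999.9999
'''

# Pattern from the question: one outer group plus two separator groups,
# so findall() returns a 3-tuple per match.
capturing = re.compile(r'(\d{3}(-|\s|\.)\d{3}(-|\s|\.)\d{4})')

# Separators made non-capturing with (?:...): only the outer group is left,
# so findall() returns plain strings.
non_capturing = re.compile(r'(\d{3}(?:-|\s|\.)\d{3}(?:-|\s|\.)\d{4})')

# Character-class variant with no groups at all: findall() returns
# the whole match as a string.
char_class = re.compile(r'\d{3}[-\s.]\d{3}[-\s.]\d{4}')

with_tuples = capturing.findall(teststr)
fixed = non_capturing.findall(teststr)
fixed_class = char_class.findall(teststr)

# Alternative that leaves the capturing groups in place:
# group(0) is the full match regardless of how many groups exist.
via_finditer = [m.group(0) for m in capturing.finditer(teststr)]

print(with_tuples)
print(fixed)
print(fixed_class)
print(via_finditer)
```

Only the first print should show tuples; the other three should print the same list of three plain strings.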

