简体   繁体   中英

Python -- Regex match pattern OR end of string

import re
re.findall("(\+?1?[ -.]?\(?\d{3}\)?[ -.]?\d{3}[ -.]?\d{4})(?:[ <$])", "+1.222.222.2222<")

The above code works fine if my string ends with a "<" or space. But if it's the end of the string, it doesn't work. How do I get +1.222.222.2222 to return in this condition:

import re
re.findall("(\+?1?[ -.]?\(?\d{3}\)?[ -.]?\d{3}[ -.]?\d{4})(?:[ <$])", "+1.222.222.2222")

*I removed the "<" and just terminated the string. It returns none in this case. But I'd like it to return the full string -- +1.222.222.2222

POSSIBLE ANSWER:

import re
re.findall("(\+?1?[ -.]?\(?\d{3}\)?[ -.]?\d{3}[ -.]?\d{4})(?:[ <]|$)", "+1.222.222.2222")

I think you've solved the end-of-string issue, but there are a couple of other potential issues with the pattern in your question:

  • the - in [ -.] either needs to be escaped as \- or placed in the first or last position within square brackets, eg [-. ] [-. ] or [.-] ; if you search for [] in the docs here you'll find the relevant info:
Ranges of characters can be indicated by giving two characters and separating them 
by a '-', for example [a-z] will match any lowercase ASCII letter, [0-5][0-9] will match
all the two-digits numbers from 00 to 59, and [0-9A-Fa-f] will match any hexadecimal
digit. If - is escaped (e.g. [a\-z]) or if it’s placed as the first or last character
(e.g. [-a] or [a-]), it will match a literal '-'.
  • you may want to require that either matching parentheses or none are present around the first 3 of 10 digits using (?:\(\d{3}\)?|\d{3}[-. ]?)

Here's a possible tweak incorporating the above

import re
pat = "^((?:\+1[-. ]?|1[-. ]?)?(?:\(\d{3}\) ?|\d{3}[-. ]?)\d{3}[-. ]?\d{4})(?:[ <]|$)"
print( re.findall(pat, "+1.222.222.2222") )
print( re.findall(pat, "+1(222)222.2222") )
print( re.findall(pat, "+1(222.222.2222") )

Output:

['+1.222.222.2222']
['+1(222)222.2222']
[]

Maybe try:

import re
re.findall("(\+?1?[ -.]?\(?\d{3}\)?[ -.]?\d{3}[ -.]?\d{4})(?:| |<|$)", "+1.222.222.2222")
  • null matches any position, +1.222.222.2222
  • matches space character, +1.222.222.2222
  • < matches less-than sign character, +1.222.222.2222<
  • $ end of line, +1.222.222.2222

You can also use regex101 for easier debugging.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM