
Match a regex to the whole string and not just a part of the string

I have a regex: r'((\+91|0)?\s?\d{10})'

I'm trying to match numbers like +91 1234567890, 1234567790 and 01234567890.

A number like 1234568901112 shouldn't be matched, because it is neither exactly 10 digits long nor a +91 or 0 prefix followed by exactly 10 digits.

When I try to use re.findall():

re.findall(r'((\+91|0)?\s?\d{10})', '+91 1234567890, 1234567790, 01234567890, 1234568901112')
[('+91 1234567890', '+91'),
 (' 1234567790', ''),
 (' 0123456789', ''),
 (' 1234568901', '')]

You can see that the third and fourth results are not what I want. For the third one I expect 01234567890, because it starts with 0 and is followed by 10 digits, but only the first 10 characters are returned. I also don't want the fourth result at all, because that number doesn't match completely. Either the pattern matches the complete word/string, or the number is invalid.

Is there any other regex that I can use? Or a function? What am I doing wrong here?

The expected output is:

['+91 1234567890', '1234567790', '01234567890']

Please let me know if any more clarifications are needed.

You may use

r'(?<!\w)(?:(?:\+91|0)\s?)?\d{10}\b'

See the regex demo.

The point is to match these patterns as whole words. The problem is that the first part is optional and one of the optional alternatives starts with a non-word char (+), so a single \b word boundary won't work at the start of the pattern.
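To see why a leading \b is not enough, here is a minimal sketch comparing a naive \b...\b pattern with the lookbehind version. The variable names naive and fixed are only illustrative. Because + is not a word character, \b cannot match right before +91 at the start of the string, so the naive pattern silently drops the prefix.

```python
import re

s = '+91 1234567890, 1234567790, 012345678900, 1234568901112, 01234567890'

# Naive attempt: word boundaries on both sides.
# \b needs a word char on exactly one side, and '+' is not a word char,
# so the match can only begin at the first digit after "+91 ".
naive = re.findall(r'\b(?:(?:\+91|0)\s?)?\d{10}\b', s)

# Lookbehind version: only requires that no word char precedes the match.
fixed = re.findall(r'(?<!\w)(?:(?:\+91|0)\s?)?\d{10}\b', s)

print(naive)  # ['1234567890', '1234567790', '01234567890']
print(fixed)  # ['+91 1234567890', '1234567790', '01234567890']
```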

Details

  • (?<!\w) - there should be no word char immediately to the left of the current location
  • (?:(?:\+91|0)\s?)? - an optional occurrence of
    • (?:\+91|0) - +91 or 0
    • \s? - an optional whitespace
  • \d{10}\b - ten digits followed by a word boundary, so no word char is allowed immediately to the right; together with the lookbehind, the number is matched as a whole word
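The breakdown above can also be written into the pattern itself with re.VERBOSE, which ignores whitespace in the pattern and allows # comments. This is just a sketch of the same regex, and the name PHONE_RE is made up for the example. Note that all groups are non-capturing, which is why re.findall() returns plain strings instead of the tuples seen in the question.

```python
import re

PHONE_RE = re.compile(r"""
    (?<!\w)          # no word char immediately to the left
    (?:
        (?:\+91|0)   # +91 or 0
        \s?          # an optional whitespace
    )?               # the whole prefix is optional
    \d{10}           # exactly ten digits
    \b               # no word char immediately to the right
""", re.VERBOSE)

s = '+91 1234567890, 1234567790, 01234567890, 1234568901112'
result = PHONE_RE.findall(s)
print(result)  # ['+91 1234567890', '1234567790', '01234567890']
```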

Python demo:

import re
s = '+91 1234567890, 1234567790, 012345678900, 1234568901112, 01234567890'
print(re.findall(r'(?<!\w)(?:(?:\+91|0)\s?)?\d{10}\b', s))
# => ['+91 1234567890', '1234567790', '01234567890']
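If the numbers always arrive as a comma-separated list, a different approach (not the one used above) is to split the input first and validate each piece with re.fullmatch(), which only succeeds when the pattern covers the entire string. This is a rough sketch under that assumption; the helper name valid_numbers is hypothetical.

```python
import re

# Same pattern without the boundary checks: fullmatch() anchors both ends.
PATTERN = re.compile(r'(?:(?:\+91|0)\s?)?\d{10}')

def valid_numbers(text):
    """Return the comma-separated entries of text that match PATTERN in full."""
    candidates = (part.strip() for part in text.split(','))
    return [c for c in candidates if PATTERN.fullmatch(c)]

s = '+91 1234567890, 1234567790, 012345678900, 1234568901112, 01234567890'
print(valid_numbers(s))
# ['+91 1234567890', '1234567790', '01234567890']
```

This trades flexibility for simplicity: it will not find numbers embedded in free text, but each candidate is either accepted as a whole or rejected.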

