Problem inserting commas into numbers using regex: Python

I am learning lookahead and lookbehind in regex and trying to insert a comma between every group of 3 digits in a number. I am stuck here:

import re

text = "The population of India is 1300598526 and growing."
pattern = re.compile(r'\b(?<=\d)(?=(\d\d\d)+\b)')
line = re.sub(pattern, r',', text)
print(line)

Expected output:

"The population of India is 1,300,598,526 and growing."

The actual output has no commas: the pattern doesn't match anywhere. I have tried fiddling with the pattern and found out that the leading \b is the culprit. The pattern works fine without it. Why is that so? Please clarify.

I'd prefer to know of the error in the abovementioned pattern instead of a newly crafted pattern to achieve the same thing. Thanks.

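Your regex starts with \b(?<=\d): a word boundary that must also be preceded by a digit. There is no word boundary between two digits, so in this text the only position that satisfies both conditions is the end of 1300598526. At that position the next character is a space, so the lookahead (?=(\d\d\d)+\b), which needs at least three digits to the right, cannot match. Without the leading \b, the lookbehind alone allows positions between digits, which is why the pattern then works.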
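The explanation above can be checked directly. The following is a minimal sketch, using the text from the question, that lists the positions where \b(?<=\d) matches and compares the substitution with and without the leading \b. The variable names are illustrative only.

```python
import re

text = "The population of India is 1300598526 and growing."
number = "1300598526"

# Index just past the last digit of the number.
number_end = text.index(number) + len(number)

# Positions that are a word boundary AND are preceded by a digit.
boundary_positions = [m.start() for m in re.finditer(r"\b(?<=\d)", text)]
print(boundary_positions == [number_end])  # only the end of the number qualifies

# With the leading \b the lookahead can never succeed, so nothing changes.
with_boundary = re.sub(r"\b(?<=\d)(?=(\d\d\d)+\b)", ",", text)
print(with_boundary)

# Without it, positions between digits are allowed and commas are inserted.
without_boundary = re.sub(r"(?<=\d)(?=(\d\d\d)+\b)", ",", text)
print(without_boundary)
```

Note that re.sub returns the input string unchanged when there is no match, so the first substitution prints the original sentence.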

One way to add commas only to numbers that are not part of a word is to split the text on spaces to get the words. Then map over the items, check whether each one consists of 4 or more digits with \A\d{4,}\Z, and if so add a comma after every digit that is followed by a multiple of 3 digits, using:

\d(?=(?:\d{3})+(?!\d))

Explanation

  • \d Match a digit
  • (?= Positive lookahead to assert that what is on the right is
    • (?:\d{3})+ one or more sets of 3 digits
    • (?!\d) Negative lookahead to assert that what is on the right is not a digit
  • ) Close positive lookahead

For example:

import re

text = "The population 12test1234 of India is 1300598526 and growing."
pattern = re.compile(r"\d(?=(?:\d{3})+(?!\d))")
subst = r"\g<0>,"

res = map(lambda x: re.sub(pattern, subst, x) if re.match(r"\A\d{4,}\Z", x) else x, text.split(' '))
print(" ".join(res))

Result

The population 12test1234 of India is 1,300,598,526 and growing.
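The whole-token check \A\d{4,}\Z is what leaves 12test1234 untouched. As a small sketch of what happens without it, the snippet below applies the same pattern directly to single tokens. The helper name add_commas is hypothetical and used here only for illustration.

```python
import re

pattern = re.compile(r"\d(?=(?:\d{3})+(?!\d))")

def add_commas(token):
    # Hypothetical helper: append a comma to every digit that is followed
    # by a multiple of 3 digits and then a non-digit or the end of the string.
    return pattern.sub(r"\g<0>,", token)

print(add_commas("1300598526"))  # 1,300,598,526
print(add_commas("12test1234"))  # 12test1,234
print(add_commas("123"))         # 123
```

The second call shows that the pattern on its own also changes digits embedded in a word, which is why the example above filters tokens first.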

