
Python re.sub: replacing single or multiple characters

I have a lot of strings of the form

100XX 123XX 1XX 234XXXXX

and I would like to replace every X with a 0. The strings also contain other text, in the form of an address:

234XX N. Somestreet Anytown, USA

I can't be sure that an X doesn't appear anywhere else in the text, so I can't just replace every X.

I have this code so far, but it only inserts a single 0, and I need it to insert a variable number of 0s:

re.sub(r"([0-9]+)([X]+)", r"\g<1>0", "234XX")

This gives me 2340, but I need it to return 23400 or, given 123XXX, 123000.

You can pass a callback function to re.sub to get the desired result; see http://ideone.com/ccB37k

import re

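def repl(m):
    # group(1) is the digits, group(2) the run of X's; swap each X for a 0
    return m.group(1) + m.group(2).replace('X', '0')

# named "text" rather than "str" to avoid shadowing the built-in
text = '234XX N. Somestreet Anytown, USA'
# \b on both sides restricts matches to whole words such as 234XX
pattern = r'\b(\d+)(X+)\b'
print(re.sub(pattern, repl, text))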
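
As a quick check of how the word boundaries in that pattern behave, here is a minimal sketch. The helper name zero_fill and the second and third sample addresses are made up for illustration; it counts the X's instead of calling replace, which should amount to the same thing here.

```python
import re

# Same pattern as in the answer above: digits followed by X's, as a whole word.
PATTERN = re.compile(r'\b(\d+)(X+)\b')

def zero_fill(text):
    """Replace each run of X's that follows digits with the same number of zeros."""
    return PATTERN.sub(lambda m: m.group(1) + '0' * len(m.group(2)), text)

samples = [
    '234XX N. Somestreet Anytown, USA',
    '1XX Xavier Rd',      # hypothetical: the X in "Xavier" should be left alone
    '12XY Example St',    # hypothetical: no word boundary after the X, so no match
]
for s in samples:
    print(zero_fill(s))
```

Only tokens made up entirely of digits followed by X's are rewritten, so an X inside an ordinary word is not touched.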

What I'd do is use finditer to get the match objects for your regex; you can then use methods like start() and end() to rebuild your string. Since each X is replaced by exactly one 0, the string never changes length, so you can do the replacement in place without the match offsets going stale.

import re

res = '234XX N. Somestreet Anytown, USA\n234XXXXXX N. Somestreet Anytown, USA\nXXXXXXXXXX'

for match in re.finditer(r"([0-9]+)([X]+)", res):
    print(match.group(1))
    print(len(match.group(2)))
    # res = res[:match.end(1)] + ('0' * len(match.group(2))) + res[match.end():]
    res = res[:match.end(1)] + match.group(2).replace('X','0') + res[match.end():]

print(res)
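
If the replacement could ever differ in length from the text it replaces, the in-place slicing would no longer line up with the match offsets. One way around that, sketched here under the assumption that the same regex is used, is to collect the pieces and join them at the end. The function name zero_fill_pieces is hypothetical.

```python
import re

def zero_fill_pieces(text):
    """Rebuild the string piece by piece from finditer matches."""
    pieces = []
    last = 0  # end offset of the previous match in the original text
    for m in re.finditer(r"([0-9]+)(X+)", text):
        # Everything up to the end of the digits is copied unchanged.
        pieces.append(text[last:m.end(1)])
        # One "0" per X in the matched run.
        pieces.append('0' * len(m.group(2)))
        last = m.end()
    pieces.append(text[last:])  # tail after the final match
    return ''.join(pieces)

text = '234XX N. Somestreet Anytown, USA\n234XXXXXX N. Somestreet Anytown, USA\nXXXXXXXXXX'
print(zero_fill_pieces(text))
```

The last line of the sample has no leading digits, so it is left as it is.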

What I ended up doing was writing a callable and passing it to re.sub:

import re

def sub_0_for_x(match):
    # groups() returns (digits, run of X's)
    old = match.groups()
    return old[0] + "0" * len(old[1])

re.sub(r"([0-9]+)(X+)", sub_0_for_x, "123XX Anyplace, USA")
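
For comparison, here is a small sketch of why a callable helps in this case: a replacement template is fixed text plus backreferences, so it cannot vary with the length of the X run, while a function sees each match and can size its output. The constant name PATTERN is just for illustration; the regex is the one from the question.

```python
import re

PATTERN = r"([0-9]+)(X+)"

def sub_0_for_x(match):
    digits, xs = match.groups()
    return digits + "0" * len(xs)

# A template always inserts exactly one "0", whatever the length of the X run.
template_result = re.sub(PATTERN, r"\g<1>0", "123XXX")
# A callable builds the replacement per match, so the zeros track the X's.
callable_result = re.sub(PATTERN, sub_0_for_x, "123XXX")

print(template_result)
print(callable_result)
```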
