Python: find a substring and then replace all characters within it

Let's say I have this string:

<div>Object</div><img src=#/><p> In order to be successful...</p>

I want to substitute every character between < and > with a #.

So, after some operation, I want my string to look like:

<###>Object<####><##########><#> In order to be successful...<##>

Notice that every character between the two symbols was replaced with # (including whitespace).

This is the closest I could get:

   r = re.sub('<.*?>', '<#>', string)

The problem with my code is that all characters between < and > are replaced by a single #, whereas I would like every individual character to be replaced by a #.

I tried a mixture of various back references, but to no avail. Could someone point me in the right direction?

What about...:

import re

def hashes(mo):
    # mo is the match object; group(1) is the text between < and >
    replacing = mo.group(1)
    return '<{}>'.format('#' * len(replacing))

and then

r = re.sub(r'<(.*?)>', hashes, string)

The ability to use a function as the second argument to re.sub gives you huge flexibility in building up your substitutions (and, as usual, a named def results in much more readable code than any cramped lambda -- you can use meaningful names, normal layouts, etc, etc).
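As a quick end-to-end check, here is a minimal sketch that applies the approach above to the string from the question. The wrapper name mask_tags is only an illustrative choice, not something from the answer itself.

```python
import re

def hashes(mo):
    # group(1) holds everything captured between '<' and '>'
    inner = mo.group(1)
    # Emit one '#' per captured character, keeping the angle brackets
    return '<{}>'.format('#' * len(inner))

def mask_tags(text):
    # Hypothetical helper: re.sub calls hashes() once per non-overlapping match
    return re.sub(r'<(.*?)>', hashes, text)

string = '<div>Object</div><img src=#/><p> In order to be successful...</p>'
r = mask_tags(string)
print(r)
# <###>Object<####><##########><#> In order to be successful...<##>
```

The non-greedy .*? matters here: a greedy .* would capture from the first < to the last > in one match, so the text between the tags would be masked as well.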

The re.sub function can be called with a function as the replacement, rather than a new string. Each time the pattern is matched, the function will be called with a match object, just like you'd get using re.search or re.finditer.

So try this:

re.sub(r'<(.*?)>', lambda m: "<{}>".format("#" * len(m.group(1))), string)
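To see what the replacement function actually receives, the same pattern can be run through re.finditer. This is only an illustrative sketch; the variable names pattern and matches are assumptions made for the example.

```python
import re

string = '<div>Object</div><img src=#/><p> In order to be successful...</p>'
pattern = re.compile(r'<(.*?)>')

# Each m is the same kind of match object that re.sub passes to its callback.
matches = [(m.group(0), m.group(1), m.span()) for m in pattern.finditer(string)]

for whole, inner, span in matches:
    # whole: the full match including brackets; inner: the captured group
    print(whole, repr(inner), len(inner), span)
```

The length of group(1) is what determines how many # characters go back between the brackets in the substitution.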
