
Python: replace the content between angle brackets (<>)

I want to replace the content in between <>

ex:

input: this is a < test >

output: this is a < hh >

so far I have:

import re

test = 'this is a <test>'
test = re.sub(r'\<[^>]*\>', 'hh', test)
print(test)

This always removes the <> as well and produces output like: this is a hh. What I want is: this is a < hh >

How should I fix it?

As thefourtheye suggests, one solution is to do

newstr = 'hh'
test = re.sub(r'\<[^>]*\>', '<' + newstr + '>', test)

But I suspect there is a more elegant solution using re alone.
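One possible variation on the concatenation approach is to pass a function as the replacement. This is only a sketch: `replace_bracketed` is a hypothetical helper name, not something from the answers here. The reason to consider it is that a replacement string is scanned for backslash escapes such as \1, whereas the return value of a replacement function is inserted literally.

```python
import re

def replace_bracketed(text, new):
    """Replace whatever sits between < and > with `new`, keeping the brackets.

    Hypothetical helper for illustration only.
    """
    # The value returned by a replacement function is used as-is, so any
    # backslashes in `new` are not treated as group references or escapes.
    return re.sub(r'<[^>]*>', lambda m: '<' + new + '>', text)

result = replace_bracketed('this is a <test>', 'hh')
print(result)  # this is a <hh>
```

If the new text never contains backslashes, the plain string concatenation behaves the same way.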

You can use the following:

test = re.sub(r'(?<=<)[^<>]*(?=>)', 'hh', test)

This uses a positive lookbehind to require the < before the desired pattern and a positive lookahead to require the > after it, so neither bracket is part of the match and both survive the substitution.

Once you have your regex, you can put parentheses around the parts you want to capture and recall them in the substitution.

The example below shows this method. First you wrap the < and the > in parentheses, with the regex for a word of any length in between. In the substitution you recall the first captured group, then comes the 'hh', and then you recall the second captured group. A group is recalled with a backslash followed by its number, for example \1 or \2.

import re

test = "<test>"
myre = r'(<)\w*(>)'   # group 1 is "<", group 2 is ">"
mysub = r'\1hh\2'     # put the captured brackets back around 'hh'
newstring = re.sub(myre, mysub, test)
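A small caveat with numbered backreferences, sketched here under the assumption that the replacement text might start with a digit: writing \1 directly in front of a digit makes the group number ambiguous. The \g<1> form refers to the same group but keeps the number clearly delimited. The variable `new_value` is a hypothetical name used only for this illustration.

```python
import re

test = 'this is a <test>'
pattern = r'(<)\w*(>)'

# \g<1> and \g<2> refer to the same groups as \1 and \2, but the explicit
# delimiters keep the group number separate from any digits that follow.
new_value = '42'  # hypothetical replacement text that starts with a digit
result = re.sub(pattern, r'\g<1>' + new_value + r'\g<2>', test)
print(result)  # this is a <42>
```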

You could use a positive lookahead and lookbehind.

>>> import re
>>> test = 'this is a <test>'
>>> test = re.sub(r'(?<=<)[^><]*(?=>)', r'hh', test)
>>> print(test)
this is a <hh>

Your regex matches the < and > symbols as well, so they are removed along with the content. With lookarounds you can keep those symbols out of the match. Lookarounds are zero-width assertions: they check the surrounding text but do not consume any characters.
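To make the "zero width" point concrete, here is a short sketch that compares what each pattern actually matches on the question's sample string. The variable names `consuming` and `lookaround` are just labels chosen for this illustration.

```python
import re

test = 'this is a <test>'

# The original pattern consumes the brackets, so they are part of the match.
consuming = re.search(r'<[^>]*>', test)

# The lookaround pattern only asserts that the brackets are there.
lookaround = re.search(r'(?<=<)[^><]*(?=>)', test)

print(consuming.group())   # <test>
print(lookaround.group())  # test
```

Since re.sub replaces exactly the matched text, only the second pattern leaves the brackets in place.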

test = 'this is a <test>'
test = re.sub(r'\<[^>]*\>', '<hh>', test)
print (test)

It can be done as simply as this.
