Is there any way to get around this limitation of re.sub? Verbose mode is not fully supported in the replacement pattern: backreferences are interpreted properly, but whitespace and comments are not stripped.
import re
ft1=r"""(?P<test>[0-9]+)"""
ft2=r"""\g<test>and then: \g<test> #this remains"""
print(re.sub(ft1,ft2,"front 1234 back",flags=re.VERBOSE)) #Does not work
#result: front 1234and then: 1234 #this remains back
re.VERBOSE does not apply to the replacement pattern... Is there a work-around? (Simpler than working with groups after an re.match.)
Here is the only way I have found to "compile" an re replace expression for sub. There are a few extra constraints: spaces and newlines that must survive have to be written in square brackets, the way spaces are written in a verbose match expression ([ ] and [\\n\\n\\n]), and the whole replace expression should begin with a newline.
An example: this searches a string for a word repeated after /ins/ and /del/, then replaces those occurrences with a single occurrence of the word in front of the <ins> tag.
Both the match and the replace expressions are complex, which is why I want a verbose version of the replace expression.
===========================
import re
test = "<p>Le petit <ins>homme à</ins> <del>homme en</del> ressorts</p>"
find=r"""
<ins>
(?P<front>[^<]+) #there is something added that matches
(?P<delim1>[ .!,;:]+) #get delimiter
(?P<back1>[^<]*?)
</ins>
[ ]
<del>
(?P=front)
(?P<delim2>[ .!,;:]+)
(?P<back2>[^<]*?)
</del>
"""
replace = r"""
<<<<<\g<front>>>>> #Pop out in front matching thing
<ins>
\g<delim1>
\g<back1>
</ins>
[ ]
<del>
\g<delim2> #put delimiters and backend back
\g<back2>
</del>
"""
flatReplace = r"""<<<<<\g<front>>>>><ins>\g<delim1>\g<back1></ins> <del>\g<delim2>\g<back2></del>"""
def compileRepl(inString):
    outString=inString
    #get space at front of line
    outString=re.sub(r"\n\s+","\n",outString)
    #get space at end of line
    outString=re.sub(r"\s+\n","",outString)
    #get rid of comments
    outString=re.sub(r"\s*#[^\n]*\n","\n",outString)
    #preserve space in brackets, and eliminate brackets
    outString=re.sub(r"(?<!\[)\[(\s+)\](?!\[)",r"\1",outString)
    # get rid of newlines not in brackets
    outString=re.sub(r"(?<!\[)(\n)+(?!\])","",outString)
    #get rid of brackets around newlines
    outString=re.sub(r"\[((\\n)+)\]",r"\1",outString)
    #trim brackets
    outString=re.sub(r"\[\[(.*?)\]\]","[\\1]",outString)
    return outString
assert(flatReplace == compileRepl(replace))
print(test)
print(compileRepl(replace))
print(re.sub(find,compileRepl(replace),test, flags=re.VERBOSE))
#<p>Le petit <ins>homme à</ins> <del>homme en</del> ressorts</p>
#<<<<<\g<front>>>>><ins>\g<delim1>\g<back1></ins> <del>\g<delim2>\g<back2></del>
#<p>Le petit <<<<<homme>>>><ins> à</ins> <del> en</del> ressorts</p>
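A lighter-weight alternative, not taken from the post above, is to let Python join the replacement for you: adjacent string literals inside parentheses are concatenated at compile time, so each piece can sit on its own line with an ordinary Python comment. This is only a sketch of that idea, reusing the same find pattern and test string; no post-processing function is needed, and a literal space is written as a plain " " piece.

```python
import re

# Search pattern in verbose mode, as in the example above.
find = re.compile(r"""
    <ins>
    (?P<front>[^<]+)          # word that appears in both tags
    (?P<delim1>[ .!,;:]+)     # delimiter after the word
    (?P<back1>[^<]*?)
    </ins>
    [ ]
    <del>
    (?P=front)
    (?P<delim2>[ .!,;:]+)
    (?P<back2>[^<]*?)
    </del>
""", re.VERBOSE)

# Replacement built from adjacent raw string literals.
# Python concatenates them, so comments and line breaks never reach re.sub.
replace = (
    r"<<<<<\g<front>>>>>"   # pop out the matching word in front
    r"<ins>"
    r"\g<delim1>"
    r"\g<back1>"
    r"</ins>"
    r" "                    # the single space between the two tags
    r"<del>"
    r"\g<delim2>"           # put delimiters and back end back
    r"\g<back2>"
    r"</del>"
)

test = "<p>Le petit <ins>homme à</ins> <del>homme en</del> ressorts</p>"
result = find.sub(replace, test)
print(result)
```

The joined replace string is identical to flatReplace above, so the substitution gives the same result as the compileRepl version.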
You can first use re.compile to compile the regular expressions; there you can make use of the re.VERBOSE flag. Later, you can pass the compiled expressions as arguments to re.sub().
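A minimal sketch of that suggestion, using the digits example from the question. Note that, as the question observes, re.VERBOSE only affects the search pattern; the replacement template here is therefore written flat, without comments.

```python
import re

# Verbose mode lets the search pattern carry whitespace and comments.
pattern = re.compile(r"""
    (?P<test>[0-9]+)   # one or more digits
""", re.VERBOSE)

# The replacement template is taken as written: only backreferences
# such as \g<test> are interpreted, so it contains no comments.
result = pattern.sub(r"\g<test> and then: \g<test>", "front 1234 back")
print(result)
```

The compiled pattern can also be passed as the first argument of re.sub() instead of calling its sub() method.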