How to replace a string, newlines, and spaces together in Python

I have a long string called webpage that contains something like this:

"<!--  \n    <div class=\"section_content\"> \n    </div>\n\n-->  "

I want to replace the comment markers "<!--" and "-->" with spaces. However, I cannot replace them directly, since the long string also contains real comments, such as "<!-- comments -->".

I was trying to use

re.sub(r"<!--\s+\n\s+<div",r"\n<div",webpage,flags=re.MULTILINE)

But it does not work at all. Can someone help? The result should be "\n <div class=\"section_content\"> \n </div>\n\n".

This should do it:

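import re

# Group 1 captures everything between the comment markers so it can be kept.
regex = r"<!--(\s*\n\s*<div[^>]*>\s*\n\s*</div>\n\n)-->"
string = "<!--  \n    <div class=\"section_content\"> \n    </div>\n\n-->  "
# re.sub returns a new string; the original is left unchanged.
res = re.sub(regex, r"\1", string)
print(repr(res))  # repr() makes the newlines and spaces visible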

Result (the two spaces that followed "-->" in the input are not part of the match, so they remain):

"  \n    <div class=\"section_content\"> \n    </div>\n\n  "

Then, if you don't want the newlines and spaces at either end of the string, you can call the string's .strip() method.
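
As a rough sketch of how this might behave on a page that also contains a real comment, the snippet below applies the same pattern to a made-up sample string and then strips the ends. The helper name unwrap_commented_divs and the sample input are assumptions for illustration, not part of the original question. Note that the pattern only matches a commented-out <div> with nothing but whitespace inside it, followed by exactly two newlines, so a real page may need a looser pattern.

```python
import re

# Same pattern as in the answer: the group keeps the text between the markers.
WRAPPED_DIV = re.compile(r"<!--(\s*\n\s*<div[^>]*>\s*\n\s*</div>\n\n)-->")


def unwrap_commented_divs(webpage):
    """Hypothetical helper: drop the comment markers around an empty <div>."""
    return WRAPPED_DIV.sub(r"\1", webpage)


# Made-up sample: one real comment followed by one commented-out div.
webpage = (
    "<!-- comments -->\n"
    "<!--  \n    <div class=\"section_content\"> \n    </div>\n\n-->  "
)

res = unwrap_commented_divs(webpage)
print(repr(res))          # the real comment is still there
print(repr(res.strip()))  # leading/trailing whitespace removed
```

The real comment survives because "<!--" there is followed by text rather than by whitespace, a newline, and "<div", so the pattern does not match it. Keep in mind that .strip() only trims the two ends of the whole string; whitespace between the tags is left as it is.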
