I'm trying to figure out a good way to sanitize / reformat User Generated Content that is written in the Markdown format. I want to 'correct' improper content (as best as possible).
For now I'm sticking to HTML comments (though I'd appreciate a solution that handles any embedded HTML).
The Markdown format requires any embedded HTML to appear on its own lines.
Bad (input):
one
<!-- two -->
three
four
five <!-- five.point.five -->
six
Good (output):
one
<!-- two -->
three
four
five
<!-- five.point.five -->
six
You can use this:
re.sub(r'\s*(<!--(?:[^-]+|-(?!->))*-->)\s*', '\\n\\n\\1\\n\\n', yourstring)
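As a rough sketch of how that one-liner might be used, the snippet below wraps the same pattern in a helper and runs it on the sample input from the question. The names COMMENT_RE, isolate_comments and sample are made up for illustration; only the pattern comes from the answer above.

```python
import re

# Pattern from the answer above: one HTML comment plus any whitespace around it.
# The inner group accepts any run of non-hyphens, or a hyphen that does not start "-->".
COMMENT_RE = re.compile(r'\s*(<!--(?:[^-]+|-(?!->))*-->)\s*')


def isolate_comments(text):
    """Hypothetical helper: surround every HTML comment with blank lines."""
    # In the replacement template, \n is a newline and \1 is the captured comment.
    return COMMENT_RE.sub(r'\n\n\1\n\n', text)


sample = "one\n<!-- two -->\nthree\nfour\nfive <!-- five.point.five -->\nsix"
result = isolate_comments(sample)
print(result)
```

Because the surrounding whitespace is consumed by the match, the trailing space after "five" disappears. A comment at the very start or end of the input will leave leading or trailing blank lines, so calling strip() on the result may be worthwhile.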
To convert the first output to the second you would replace <!-- with \r\n<!-- and --> with -->\r\n, or whatever newline character, or constant, is equivalent to \r\n. You could do this with replace(), probably not requiring regex. [ \r is not really necessary.]
You seem to suggest that you are doing this already, so perhaps there is more to your question.
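The plain string-replacement idea from this answer could look roughly like the following. This is only a sketch: split_comments and its newline parameter are hypothetical names, and it assumes "<!--" and "-->" never appear in the text except as comment delimiters.

```python
def split_comments(text, newline="\n"):
    """Hypothetical helper: put comment delimiters on their own lines via str.replace."""
    # Assumption: "<!--" and "-->" occur only as real comment delimiters.
    text = text.replace("<!--", newline + "<!--")
    text = text.replace("-->", "-->" + newline)
    return text


sample = "five <!-- five.point.five -->\nsix"
result = split_comments(sample)
print(repr(result))
```

Unlike the regex version, this does not trim whitespace next to the comment, and it adds a newline even when the comment already sits on its own line, so some cleanup of extra blank lines would likely still be needed.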