I observed that python-markdown always escapes HTML entities inside backticks, even with safe=False:
In [1]: import markdown
In [2]: markdown.markdown("&")
Out[2]: u'<p>&</p>'
In [3]: markdown.markdown("*&*")
Out[3]: u'<p><em>&</em></p>'
In [4]: markdown.markdown("`&`")
Out[4]: u'<p><code>&amp;</code></p>'
Is it a bug or a feature; is there a way to keep HTML entities unchanged?
Backticks mark their contents as code, meaning that HTML entities must be escaped so that the code displays correctly, so this isn't a bug. While I don't know why you would want to get around that, and there may be better ways to accomplish your goals, python-markdown ignores text inside block-level HTML tags, so perhaps enclosing your HTML entities inside do-nothing HTML would suit your purposes.
>>> import markdown
>>> markdown.markdown("<div>`&`</div>")
u'<div>`&`</div>'
If you find the <div> tags objectionable, you could postprocess them out reasonably simply using a div class and an HTML parsing tool like BeautifulSoup.
>>> from BeautifulSoup import BeautifulSoup
>>> soup = BeautifulSoup("<div class='nothing'>`&`</div>")
>>> for div in soup.findAll('div', 'nothing'):
...     div.replaceWithChildren()
...
>>> print soup
`&`
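If pulling in an HTML parser feels heavy for this, a plain-text substitution is another option. Note that this is a different technique from the BeautifulSoup approach above, and only a rough sketch: the `strip_wrappers` helper is a hypothetical name, it assumes the wrapper is written exactly as `<div class='nothing'>` (single or double quotes), and it assumes no other `<div>` is nested inside the wrapper. It has the side benefit of never re-serializing the text, so whatever entity you wrote comes out byte for byte.

```python
import re

# Hypothetical marker: the same 'nothing' class used in the example above.
# Assumes the wrapper has exactly this one attribute and contains no nested <div>.
WRAPPER = re.compile(r"""<div class=(['"])nothing\1>(.*?)</div>""", re.DOTALL)


def strip_wrappers(html_text):
    """Remove the do-nothing wrapper divs, keeping their contents verbatim."""
    return WRAPPER.sub(lambda match: match.group(2), html_text)


print(strip_wrappers("<div class='nothing'>`&amp;`</div>"))  # prints `&amp;`
```

Divs without the marker class are left alone, so ordinary markup in the rendered page should not be affected.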
Maybe a bit more complicated than what you initially wanted, but I think this is probably the simplest solution short of fundamentally modifying python-markdown.
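To illustrate why the escaping inside backticks is a feature rather than a bug, here is a small stand-in using only the standard library. The `render_code_span` function is a hypothetical helper that mimics the observable behaviour (escape, then wrap in `<code>`); it is not python-markdown's actual implementation.

```python
import html


def render_code_span(source):
    """Escape the source text, then wrap it in <code>, like a code span does."""
    return "<code>" + html.escape(source, quote=False) + "</code>"


print(render_code_span("&"))      # <code>&amp;</code>
print(render_code_span("&amp;"))  # <code>&amp;amp;</code>
print(render_code_span("<b>"))    # <code>&lt;b&gt;</code>
```

Because of the escaping, a browser shows exactly the characters that were typed between the backticks. Without it, `&amp;` written as code would display as a bare `&`, and `<b>` would be treated as markup instead of being shown.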