在Python中转义XML属性

Question

如何转义字符串以便可以在XML属性中使用？

我不是从python代码中以编程方式设置属性-我只需要创建一个可用作XML属性的有效字符串即可。

我努力了：

from xml.sax.saxutils import escape, quoteattr

print (escape('<&% "eggs and spam" &>'))
# >>> &lt;&amp;% "eggs and spam" &amp;&gt;

print (quoteattr('<&% "eggs and spam" &>'))
# >>> '&lt;&amp;% "eggs and spam" &amp;&gt;'

问题在于escape()和quoteattr()都没有转义双引号字符，即" 。

当然，我可以对转义的字符串执行.replace('"', '"') ，但我假设应该有一种方法可以使用现有的API（来自标准库或第三方）模块，例如lxml ）。

更新：我发现Python3的html.escape产生了预期的结果，但是我不愿意在XML上下文中使用它，因为我假设HTML转义可能遵循与XML标准要求不同的规范（ https： //www.w3.org/TR/xml/#AVNormalize ）。

Answer 1

从tornado偷偷偷走（经过一些修改）：

import re
_XHTML_ESCAPE_RE = re.compile('[&<>"\']')
_XHTML_ESCAPE_DICT = {'&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;',
                      '\'': '&#39;'}

def xhtml_escape(value):
    """Escapes a string so it is valid within HTML or XML.

    Escapes the characters ``<``, ``>``, ``"``, ``'``, and ``&``.
    When used in attribute values the escaped strings must be enclosed
    in quotes.

    .. versionchanged:: 3.2

       Added the single quote to the list of escaped characters.
    """
    return _XHTML_ESCAPE_RE.sub(lambda match: _XHTML_ESCAPE_DICT[match.group(0)],
                                value)

在Python中转义XML属性

问题描述

1 个解决方案

解决方案1
1 已采纳 2019-01-29 15:49:36

在Python中转义XML属性

问题描述

1 个解决方案

解决方案1 1 已采纳 2019-01-29 15:49:36

解决方案1
1 已采纳 2019-01-29 15:49:36