
How to convert special characters into HTML entities?

I want to convert, in Python, special characters such as "%$!&@á é ©" into HTML entities, and not only '<&">', which is all that the documentation and references I've found so far cover. cgi.escape doesn't solve the problem.

For example, the string "á ê ĩ &" should be converted to "&aacute; &ecirc; &itilde; &amp;".

Does anybody know how to solve this? I'm using Python 2.6.

You could build your own loop using the dictionaries in the htmlentitydefs module: http://docs.python.org/library/htmllib.html#module-htmlentitydefs

The one you're looking for is htmlentitydefs.codepoint2name, which maps Unicode code points to HTML entity names.
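A minimal sketch of such a loop, written for Python 3, where the same dictionary lives in html.entities (on Python 2.6 it is htmlentitydefs.codepoint2name). The function name to_named_entities is made up for illustration. Characters that have no entry in the table are left unchanged.

```python
# Python 2 equivalent: from htmlentitydefs import codepoint2name
from html.entities import codepoint2name


def to_named_entities(text):
    """Replace every character that has a named entity in codepoint2name."""
    parts = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        # Characters without a named entity are kept as they are.
        parts.append("&%s;" % name if name else ch)
    return "".join(parts)


print(to_named_entities("á ê &"))  # &aacute; &ecirc; &amp;
```

Because each character is looked at exactly once, the "&" inside an entity that was just produced is never escaped a second time.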

Searching for the htmlentitydefs.codepoint2name that @Ruben Vermeersch mentioned in his answer, I found a solution that needs only the standard library. It comes from here: http://bytes.com/topic/python/answers/594350-convert-unicode-chars-html-entities

Here's the function:

from htmlentitydefs import codepoint2name  # Python 2 module

def htmlescape(text):
    # Accept a UTF-8 byte string and work on unicode internally.
    if isinstance(text, str):
        text = text.decode('utf-8')
    # Escape "&" first, so the "&" of the entities added below
    # is not escaped a second time.
    text = text.replace(u"&", u"&amp;")
    for code, name in codepoint2name.iteritems():
        if code != 38:  # "&" was already handled above
            text = text.replace(unichr(code), u'&%s;' % name)
    return text
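
The table behind codepoint2name only covers the HTML 4 entity names, so a character such as ĩ has no entry and passes through the function above unchanged. One possible variation, sketched here for Python 3 and not taken from the linked thread, falls back to numeric character references for whatever is left. The names ENTITY_TABLE and htmlescape_py3 are hypothetical.

```python
from html.entities import codepoint2name

# Translation table for str.translate: code point -> "&name;"
ENTITY_TABLE = {code: "&%s;" % name for code, name in codepoint2name.items()}


def htmlescape_py3(text):
    # One pass over the string, so "&" is never escaped twice.
    named = text.translate(ENTITY_TABLE)
    # Any remaining non-ASCII character becomes a numeric reference.
    return named.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(htmlescape_py3("á ê ĩ &"))  # &aacute; &ecirc; &#297; &amp;
```

The numeric form &#297; is not the &itilde; asked for in the question, but it denotes the same character.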

Thank you all for helping! ;)
