How to convert HTML escape sequences (character entities) to characters in Python
I am new to Python, so thanks in advance for the help. I want to convert HTML escape sequences (character entities) into the characters they represent, for example &lt; into <. A single HTML page contains many different escape sequences, and I don't want to write a separate replace statement for each one, like:
str = str.replace('&nbsp;', ' ')
# ... many more replace calls ...
str = str.replace('&lt;', '<')
str = str.replace('&gt;', '>')
This gets very long. I just want a single function that handles all of them. Thank you very much.
Use the unescape method of HTMLParser.HTMLParser:
>>> from HTMLParser import HTMLParser
>>> # from html.parser import HTMLParser # In Python 3.x
>>>
>>> parser = HTMLParser()
>>> parser.unescape('&gt;_&lt;')
u'>_<'
>>> parser.unescape('&#48;&#49;&#50;')
u'012'
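On Python 3.4 and later the standard library also offers html.unescape, and HTMLParser.unescape was removed in Python 3.9, so on a modern interpreter something like the following sketch is probably the safer route. The sample strings are only illustrations, not taken from a real page.

```python
import html

# html.unescape decodes named and numeric character references in one call.
print(html.unescape('&gt;_&lt;'))          # named entities
print(html.unescape('&#48;&#49;&#50;'))    # decimal numeric references
print(html.unescape('a &amp; b'))          # the ampersand itself
```

No parser object is needed here; the function takes a string and returns the decoded string.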
NOTE: HTMLParser.unescape('&nbsp;') returns NO-BREAK SPACE (U+00A0), not an ordinary SPACE (U+0020):
>>> parser.unescape('&nbsp;')
u'\xa0'
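If an ordinary space is wanted instead, as in the question's replace call for the non-breaking-space entity, one option is to normalize U+00A0 after decoding. This is a minimal sketch assuming Python 3.4+; the helper name unescape_with_plain_spaces is made up for illustration.

```python
import html

def unescape_with_plain_spaces(text):
    """Decode HTML entities, then turn NO-BREAK SPACE (U+00A0) into a plain space."""
    return html.unescape(text).replace('\xa0', ' ')

print(repr(unescape_with_plain_spaces('a&nbsp;&lt;&nbsp;b')))
```

Note that this also converts any literal U+00A0 characters that were already in the input, which may or may not be what you want.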
BTW, don't use str as a variable name; it shadows the built-in str.
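To see why the shadowing matters, here is a small sketch; the function name call_shadowed_str is hypothetical, and the assignment is kept inside a function so the built-in stays intact elsewhere.

```python
def call_shadowed_str():
    str = "some text"      # local name now hides the built-in str
    try:
        str(42)            # attempts to call a string object, not the built-in
    except TypeError:
        return "TypeError"
    return "no error"

print(call_shadowed_str())
```

Picking a different name such as text or page avoids the problem entirely.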