
Removing BOM characters from <HtmlElement> in Python

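I am trying to load HTML markup from a URL this way and then run some XPath queries, but the page source comes loaded with BOM characters. How can I remove them before running the XPath queries?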

import requests
import lxml.html

session = requests.Session()
page = session.get(url)
page_data = lxml.html.fromstring(page.text)

Output:

 u'Re\ufeffverse \ufeffFleece \ufeffHoo\ufeffded S\ufeffwea\ufefftshi\ufeffrt'
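
session = requests.Session()
page = session.get(url)
page_data = lxml.html.fromstring(page.text)
# tostring() serializes U+FEFF as the character reference &#65279;, so strip
# that reference and re-parse. Bytes literals keep this working on Python 3,
# where tostring() returns bytes by default.
cleaned = lxml.html.tostring(page_data).replace(b'&#65279;', b'')
page_data = lxml.html.fromstring(cleaned)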
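
A possibly simpler variant is to drop the U+FEFF characters from the decoded text before it is parsed at all, which avoids the serialize-and-reparse round trip. The sketch below is only an illustration under that assumption: `strip_bom_chars` is a hypothetical helper name, not part of requests or lxml, and the sample string is the output shown in the question.

```python
BOM = u'\ufeff'  # U+FEFF, the byte order mark / zero width no-break space


def strip_bom_chars(text):
    """Return text with every U+FEFF character removed (hypothetical helper)."""
    return text.replace(BOM, u'')


# Sample taken from the output shown in the question.
sample = u'Re\ufeffverse \ufeffFleece \ufeffHoo\ufeffded S\ufeffwea\ufefftshi\ufeffrt'
cleaned = strip_bom_chars(sample)
print(cleaned)  # Reverse Fleece Hooded Sweatshirt
```

With the code from the question, this would presumably be applied as `lxml.html.fromstring(strip_bom_chars(page.text))`, so the XPath queries never see the stray characters.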

