
Removing BOM characters from <HtmlElement> in Python

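I am trying to load HTML markup from a URL this way and then run some XPath queries, but the page source comes back with BOM characters (U+FEFF) scattered through it. How can I remove them before running XPath?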

session = requests.Session()

page = session.get(url)

page_data = lxml.html.fromstring(page.text)

Output:

 u'Re\ufeffverse \ufeffFleece \ufeffHoo\ufeffded S\ufeffwea\ufefftshi\ufeffrt'
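import lxml.html
import requests

session = requests.Session()
page = session.get(url)
page_data = lxml.html.fromstring(page.text)

# tostring() serializes to ASCII bytes by default, so each U+FEFF
# appears as the character reference &#65279; and can be stripped out.
markup = lxml.html.tostring(page_data).replace(b'&#65279;', b'')

# Re-parse the cleaned markup before running XPath queries.
page_data = lxml.html.fromstring(markup)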
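
A possibly simpler variant, sketched here under the assumption of Python 3 and that the BOMs are literal U+FEFF characters in the decoded response text (as the output in the question suggests), is to strip them from the string before parsing, which avoids the serialize-and-reparse round trip. The helper name strip_bom and the sample string are illustrative only and are not part of lxml or requests.

```python
BOM = "\ufeff"  # U+FEFF, the byte order mark / zero width no-break space


def strip_bom(text):
    """Return text with every U+FEFF character removed."""
    return text.replace(BOM, "")


# Sample taken from the output shown in the question.
sample = "Re\ufeffverse \ufeffFleece \ufeffHoo\ufeffded S\ufeffwea\ufefftshi\ufeffrt"
cleaned = strip_bom(sample)
print(cleaned)  # Reverse Fleece Hooded Sweatshirt
```

In the code from the question, this would amount to calling lxml.html.fromstring(strip_bom(page.text)) instead of lxml.html.fromstring(page.text).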

