How to convert a String into a BeautifulSoup object?
I'm trying to scrape a news website and I need to change one parameter. I replaced it with the following code:
while i < len(links):
    conn = urllib.urlopen(links[i])
    html = conn.read()
    soup = BeautifulSoup(html)
    t = html.replace('class="row bigbox container mi-df-local locked-single"', 'class="row bigbox container mi-df-local single-local"')
    n = str(t.find("div", attrs={'class':'entry cuerpo-noticias'}))
    print(p)
The problem is that "t" is a string, and find with attributes only works on the <class 'BeautifulSoup.BeautifulSoup'> type. Do you know how to convert "t" into that type?
Just do the replacement before parsing:
html = html.replace('class="row bigbox container mi-df-local locked-single"', 'class="row bigbox container mi-df-local single-local"')
soup = BeautifulSoup(html, "html.parser")
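To see the whole flow end to end, here is a minimal sketch under a few assumptions: the inline markup is a made-up stand-in for a downloaded page, and the variable names (patched, entry, entry_text, outer_classes) are only illustrative. It shows that the replaced text is still a plain string until it is passed to BeautifulSoup, after which find with attrs works.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for a downloaded page.
html = (
    '<div class="row bigbox container mi-df-local locked-single">'
    '<div class="entry cuerpo-noticias">Article body</div>'
    '</div>'
)

# str.replace returns a new string; at this point it is still plain text.
patched = html.replace(
    'class="row bigbox container mi-df-local locked-single"',
    'class="row bigbox container mi-df-local single-local"',
)

# Parsing the patched string is what turns it into a BeautifulSoup object.
soup = BeautifulSoup(patched, "html.parser")

# find() with attrs is available now because soup is a parsed tree.
entry = soup.find("div", attrs={"class": "entry cuerpo-noticias"})
entry_text = entry.get_text()
outer_classes = soup.div["class"]

print(entry_text)
print(outer_classes)
```

In other words, the "conversion" asked about is just the BeautifulSoup(...) call applied to the already-modified string.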
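Note that it is also possible (and I would even say preferred) to parse the HTML, locate the element(s), and modify the attributes of the Tag instance, e.g.: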
soup = BeautifulSoup(html, "html.parser")
for elm in soup.select(".row.bigbox.container.mi-df-local.locked-single"):
    elm["class"] = ["row", "bigbox", "container", "mi-df-local", "single-local"]
Note that class is a special multi-valued attribute, which is why we set the value to a list of individual classes.
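As a small illustration of the multi-valued behavior, the sketch below uses a shortened, made-up div (the id attribute and the names class_before, id_value, and result are only for illustration). It compares how class and id come back from a Tag, then swaps a single class rather than rewriting the whole list.

```python
from bs4 import BeautifulSoup

# Hypothetical, shortened markup; the id attribute is added only for contrast.
soup = BeautifulSoup(
    '<div class="row locked-single" id="main">test</div>', "html.parser"
)
div = soup.div

# class is parsed into a list of names, while id stays a plain string.
class_before = div["class"]
id_value = div["id"]

# Swap one class and keep the others untouched.
div["class"] = [
    "single-local" if name == "locked-single" else name
    for name in div["class"]
]

result = str(div)
print(class_before)
print(id_value)
print(result)
```

Replacing only the one class this way may be less brittle than spelling out the full list, since it does not depend on the other class names staying the same.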
Demo:
from bs4 import BeautifulSoup
html = """
<div class="row bigbox container mi-df-local locked-single">test</div>
"""
soup = BeautifulSoup(html, "html.parser")
for elm in soup.select(".row.bigbox.container.mi-df-local.locked-single"):
    elm["class"] = ["row", "bigbox", "container", "mi-df-local", "single-local"]
print(soup.prettify())
Now see how the classes of the div element have been updated:
<div class="row bigbox container mi-df-local single-local">
 test
</div>