How can I find a specific div tag using bs4?
I am trying to parse HTML text and find all <div class="topclick-list-element-game-merchant"> tags.
warpips = open('WarpipsPageText.txt', 'r')
page_text = warpips.read()
warpips.close()
bs4 = BeautifulSoup(page_text, 'html5lib')
div = bs4.find_all('div class="topclick-list-element-game-merchant"', bs4)
print(div)
When I run this code, it prints an empty list.
Below is a fragment of the HTML I am trying to isolate.
<div class="topclick-list-element-game-merchant">
CDKeys.com
<div class="platform platform-pc" title="pc"></div>
<div class="platform platform-xbox" title="xbox"></div>
</div>
</div>
<span class="topclick-list-element-price">$11.19</span>
</a>
<a href="https://cheapdigitaldownload.com/nier-replicant-ver-1-22474487139-digital-download-price-comparison/" title="NieR Replicant ver.1.22474487139 cd key best prices" class="topclick-list-element ">
<div class="topclick__image tpsprite11 tpsprite11-82-buy-nier-replicant-ver-1-22474487139-cd-key-pc-download-catalog-0" data-tp="tpsprite11"></div>
<div class="topclick-list-element-game">
<div class="topclick-list-element-game-title">NieR Replicant ver.1.22474487139</div>
<div class="topclick-list-element-game-merchant">
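You misplaced a single quote. The tag name and the class have to be passed as separate arguments, and because class is a reserved word in Python, the keyword argument is spelled class_:
div = bs4.find_all('div', class_="topclick-list-element-game-merchant")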
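For comparison, here is a minimal sketch of three equivalent ways to match a div by class in BeautifulSoup: the class_ keyword, an attrs dictionary, and a CSS selector via select(). The markup is a trimmed, hypothetical version of the snippet in the question, and the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, trimmed from the snippet in the question.
html_doc = """
<div class="topclick-list-element-game-merchant">
CDKeys.com
<div class="platform platform-pc" title="pc"></div>
</div>
<span class="topclick-list-element-price">$11.19</span>
"""

soup = BeautifulSoup(html_doc, "html.parser")
target = "topclick-list-element-game-merchant"

# 1. class_ keyword (trailing underscore, because `class` is reserved in Python)
by_keyword = soup.find_all("div", class_=target)

# 2. attrs dictionary, useful when the attribute name is not a valid identifier
by_attrs = soup.find_all("div", attrs={"class": target})

# 3. CSS selector
by_selector = soup.select("div." + target)

print(len(by_keyword), len(by_attrs), len(by_selector))
```

All three calls should return the same single element here; which one to use is mostly a matter of taste.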
To find all <div> elements with class="topclick-list-element-game-merchant", you can use the following example:
from bs4 import BeautifulSoup
html_doc = """
<div class="topclick-list-element-game-merchant">
CDKeys.com
<div class="platform platform-pc" title="pc"></div>
<div class="platform platform-xbox" title="xbox"></div>
</div>
</div>
<span class="topclick-list-element-price">$11.19</span>
</a>
<a href="https://cheapdigitaldownload.com/nier-replicant-ver-1-22474487139-digital-download-price-comparison/" title="NieR Replicant ver.1.22474487139 cd key best prices" class="topclick-list-element ">
<div class="topclick__image tpsprite11 tpsprite11-82-buy-nier-replicant-ver-1-22474487139-cd-key-pc-download-catalog-0" data-tp="tpsprite11"></div>
<div class="topclick-list-element-game">
<div class="topclick-list-element-game-title">NieR Replicant ver.1.22474487139</div>
<div class="topclick-list-element-game-merchant">
"""
soup = BeautifulSoup(html_doc, "html.parser")
for div in soup.find_all(class_="topclick-list-element-game-merchant"):
    print(div.get_text(strip=True))
Prints:
CDKeys.com
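The snippet above ends with a merchant <div> that is cut off, so the loop may also yield an element with no text. As a rough sketch, the lookup can be wrapped in a small helper that skips empty matches. The function name merchant_names and the sample markup are assumptions made for illustration, not part of the original code.

```python
from bs4 import BeautifulSoup


def merchant_names(html):
    """Return the non-empty text of every merchant <div> in html."""
    soup = BeautifulSoup(html, "html.parser")
    names = []
    for div in soup.find_all("div", class_="topclick-list-element-game-merchant"):
        text = div.get_text(strip=True)
        if text:  # skip merchant divs that contain no text
            names.append(text)
    return names


# Hypothetical sample: one populated merchant div and one empty one.
sample = """
<div class="topclick-list-element-game-merchant">
CDKeys.com
<div class="platform platform-pc" title="pc"></div>
</div>
<div class="topclick-list-element-game-merchant"></div>
"""

print(merchant_names(sample))
```

To use it on the file from the question, read WarpipsPageText.txt into a string (ideally inside a with open(...) block) and pass that string to merchant_names.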