How to find Specific item in <script> using BS4 web scraping
How do I select a specific section in a html page when web scraping using bs4?
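When I scrape a weather website, the page has two sections. When I run Humd = soup.select_one('section:-soup-contains("%")').section.text
it checks the first section, but the information I want is in the second section. How do I make the select target the second section instead of searching and selecting the first one?
How would I get the 42%? I tried: if the soup contains '%', go to the div, then the span, then the text, but it returns "Morning". The code is below.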
Humd = soup.select_one('section:-soup-contains("%")').div.span.text
https://i.stack.imgur.com/eP0Zb.png https://i.stack.imgur.com/VocDS.png
I also tried Humd = soup.select_one('section2:-soup-contains("%")').div.span.text
but it fails with a 'has no attribute div' error.
My code: https://replit.com/@HarshitJagarlam/DangerousSpitefulCopyright#main.py
You can select by id or class:
section = soup.find('section', { 'id': 'section2-id' })
section = soup.find('section', { 'class': 'section2-class' })
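As a minimal sketch of the id/class approach, the snippet below runs against a small hypothetical HTML fragment rather than the live weather page. The ids, class names, and layout are assumptions made up for illustration; the real page will use different ones. It also shows a position-based fallback for when the section has no usable id or class.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the weather page; the ids and
# class names are invented for this example.
html = """
<section id="today" class="card">
  <div><span>Morning</span></div>
</section>
<section id="details" class="card details">
  <div>Humidity</div><div><span>42%</span></div>
</section>
"""

soup = BeautifulSoup(html, "html.parser")

# Select the second section by its id or by one of its classes.
by_id = soup.find("section", {"id": "details"})
by_class = soup.find("section", {"class": "details"})

# Fallback: pick the second <section> by position (index 1).
second = soup.find_all("section")[1]

humidity = by_id.find("span").text
print(humidity)
```

Selecting by position is the most fragile of the three, since it breaks as soon as the page adds or reorders sections.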
Select your element more specifically, and use the parent element that contains Humidity:
soup.select_one('.TodayDetailsCard--detailsContainer--16Hg0 div:-soup-contains("Humidity")').span.text
from bs4 import BeautifulSoup
import requests
headers = {'User-Agent': 'Mozilla/5.0'}
url = 'https://weather.com/en-GB/weather/today/l/12ad1b2264138ebcb368cc8f5b7435cb276f7cdea8de4cf37f5bd9c22070aa76'
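# Name the parser explicitly so Beautiful Soup does not have to guess one
soup = BeautifulSoup(requests.get(url, headers=headers).text, 'html.parser')
# The hashed class name comes from the page at the time of writing and may change
print(soup.select_one('.TodayDetailsCard--detailsContainer--16Hg0 div:-soup-contains("Humidity")').span.text)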
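To see why the narrower selector helps, here is a small offline sketch. The HTML is a hypothetical simplification of the page (the `summary`, `details`, and `row` class names are assumptions), built so that the first section also contains a "%" sign, as in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup; class names are invented.
html = """
<section>
  <div class="summary"><span>Morning</span><span>10%</span></div>
</section>
<section>
  <div class="details">
    <div class="row"><div>Humidity</div><span>42%</span></div>
    <div class="row"><div>Pressure</div><span>1012 mb</span></div>
  </div>
</section>
"""

soup = BeautifulSoup(html, "html.parser")

# The question's selector: the first section that contains "%" wins,
# and its first div/span holds "Morning", not the humidity.
first_match = soup.select_one('section:-soup-contains("%")').div.span.text

# Narrower selector: start from the details container and pick the
# div whose text contains the label "Humidity".
row = soup.select_one('.details div:-soup-contains("Humidity")')

# Guard against a missing match instead of chaining attributes blindly.
humidity = row.span.text if row is not None and row.span is not None else None

print(first_match)
print(humidity)
```

Matching on the label text ("Humidity") rather than on "%" avoids picking up other percentages elsewhere on the page, and the `None` check avoids the "has no attribute" error when nothing matches.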
The following code will reliably retrieve the value next to "Humidity":
import requests
from bs4 import BeautifulSoup
url = "https://weather.com/en-GB/weather/today/l/12ad1b2264138ebcb368cc8f5b7435cb276f7cdea8de4cf37f5bd9c22070aa76"
r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')
hum = soup.find('div', string='Humidity').next_sibling
print(hum.text)
Result:
54%
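One caveat with `next_sibling`: if the markup has whitespace between the label and the value, `next_sibling` returns that whitespace text node rather than the next tag. The sketch below, run against a hypothetical fragment (the layout is an assumption, not the live page), uses `find_next_sibling()` instead, which skips text nodes. The helper name `humidity_value` is made up for this example.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment with whitespace between the label and the value.
html = """
<div class="row">
  <div>Humidity</div>
  <span>54%</span>
</div>
"""


def humidity_value(doc):
    """Return the text of the tag that follows the 'Humidity' label, or None."""
    label = doc.find("div", string="Humidity")
    if label is None:
        return None
    # find_next_sibling() returns the next sibling *tag*, skipping
    # whitespace-only text nodes that next_sibling would return.
    value = label.find_next_sibling()
    return value.get_text(strip=True) if value is not None else None


soup = BeautifulSoup(html, "html.parser")
print(humidity_value(soup))
```

Returning `None` when the label is missing lets the caller decide how to handle a changed page layout instead of crashing with an attribute error.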
The Beautiful Soup documentation is available at https://www.crummy.com/software/BeautifulSoup/bs4/doc/#