How do I select a specific section of an HTML page when web scraping with bs4?

Question

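When I scrape a weather website, there are 2 "sections". When I run Humd = soup.select_one('section:-soup-contains("%")').section.text it checks the first section, but the information I want is in the second section. How do I make the selector target the second section instead of searching and selecting the first one?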

42%

How would I get the 42%? I tried: if the soup contains '%', go to the div, then the span, then its text, but that returns Morning instead. The code is below.

Humd = soup.select_one('section:-soup-contains("%")').div.span.text

Website: https://weather.com/en-GB/weather/today/l/12ad1b2264138ebcb368cc8f5b7435cb276f7cdea8de4cf37f5bd9c22070aa76

https://i.stack.imgur.com/eP0Zb.png https://i.stack.imgur.com/VocDS.png

I also tried Humd = soup.select_one('section2:-soup-contains("%")').div.span.text but it fails with "has no attribute 'div'".

My code: https://replit.com/@HarshitJagarlam/DangerousSpitefulCopyright#main.py

Answer 1

You can select the section by its id or class:

section = soup.find('section', { 'id': 'section2-id' })

section = soup.find('section', { 'class': 'section2-class' })
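As a minimal sketch of this idea, the snippet below runs against a small inline HTML string rather than the live site. The markup, the ids `hourly` and `details`, and the class names are all invented for illustration; the real page will use different ones, so inspect it in the browser first. It also shows a positional fallback, taking the second match from `select`, which may be enough when the section has no usable id or class.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: ids, classes, and values are invented for this sketch.
html = """
<section id="hourly" class="card"><div><span>Morning</span><span>10%</span></div></section>
<section id="details" class="card details"><div>Humidity</div><span>42%</span></section>
"""
soup = BeautifulSoup(html, "html.parser")

# Select the wanted section by id or by one of its classes.
by_id = soup.find("section", {"id": "details"})
by_class = soup.find("section", {"class": "details"})

# Positional fallback: select() returns every match in document order,
# so index 1 is the second <section>.
second = soup.select("section")[1]

# The first <span> inside the chosen section holds the value.
humidity = by_id.span.text
print(humidity)
```

Selecting by id or class is usually less fragile than relying on position, since the order of sections can change when the page layout does.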

Answer 2

Try this:

soup.find('span', {'data-testid': 'PercentageValue'}).text

That is where I see the value.

By the way, this website is blocked in my country, so I would need to change my IP address from Python to test this line, and I have not done that yet.
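One caveat worth checking with this approach: `find` returns only the first match, so if several elements carry the same `data-testid`, the result may again come from the wrong section. The sketch below uses invented inline HTML (the `HourlyCard` and `DetailsCard` values are assumptions, not taken from the real page) to show the difference between an unscoped and a scoped lookup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the section data-testid values are invented for this sketch.
html = """
<section data-testid="HourlyCard"><span data-testid="PercentageValue">10%</span></section>
<section data-testid="DetailsCard"><span data-testid="PercentageValue">42%</span></section>
"""
soup = BeautifulSoup(html, "html.parser")

# Unscoped: find() stops at the first matching span in the document.
first = soup.find("span", {"data-testid": "PercentageValue"}).text

# Scoped: locate the wanted section first, then search only inside it.
details = soup.find("section", {"data-testid": "DetailsCard"})
scoped = details.find("span", {"data-testid": "PercentageValue"}).text

# The same scoping written as a single CSS selector.
css = soup.select_one(
    'section[data-testid="DetailsCard"] span[data-testid="PercentageValue"]'
).text

print(first, scoped, css)
```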

Answer 3

Select your element more specifically, using the parent element that contains the text Humidity:

soup.select_one('.TodayDetailsCard--detailsContainer--16Hg0 div:-soup-contains("Humidity")').span.text

Example

from bs4 import BeautifulSoup
import requests

headers = {'User-Agent': 'Mozilla/5.0'}

url = 'https://weather.com/en-GB/weather/today/l/12ad1b2264138ebcb368cc8f5b7435cb276f7cdea8de4cf37f5bd9c22070aa76'
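# Name the parser explicitly so bs4 does not warn about guessing one
soup = BeautifulSoup(requests.get(url, headers=headers).text, 'html.parser')

# Print the text of the first span inside the div that contains "Humidity"
print(soup.select_one('.TodayDetailsCard--detailsContainer--16Hg0 div:-soup-contains("Humidity")').span.text)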
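To see why the selector from the question lands in the wrong place, here is a small offline sketch with invented markup (only the class name `TodayDetailsCard--detailsContainer--16Hg0` comes from the answer above; everything else is an assumption). It also matches the container with a substring attribute selector, on the assumption that the trailing `16Hg0` part of the class is generated and may change between site builds.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; only the details container class is taken from the answer.
html = """
<section class="HourlyCard--x1"><div><span>Morning</span><span>10%</span></div></section>
<section class="TodayDetailsCard--detailsContainer--16Hg0">
  <div><div>Wind</div><span>5 mph</span></div>
  <div><div>Humidity</div><span>42%</span></div>
</section>
"""
soup = BeautifulSoup(html, "html.parser")

# The question's selector: the first <section> containing "%" is the hourly
# one, and its first div/span holds the label rather than a percentage.
first_match = soup.select_one('section:-soup-contains("%")').div.span.text

# Scoped selector: match the container by a substring of its class, then the
# first div inside it whose text contains "Humidity".
row = soup.select_one('[class*="detailsContainer"] div:-soup-contains("Humidity")')
humidity = row.span.text

print(first_match, humidity)
```

Anchoring on the visible label text tends to survive markup changes better than anchoring on the "%" character, which appears in several unrelated places on a weather page.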

Answer 4

The following code will reliably retrieve the value next to "Humidity":

import requests
from bs4 import BeautifulSoup

url = "https://weather.com/en-GB/weather/today/l/12ad1b2264138ebcb368cc8f5b7435cb276f7cdea8de4cf37f5bd9c22070aa76"

r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')

hum = soup.find('div', string='Humidity').next_sibling
print(hum.text)

Result:

54%

The Beautiful Soup documentation is available at https://www.crummy.com/software/BeautifulSoup/bs4/doc/#
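One detail to watch with `next_sibling`: when the HTML has whitespace between tags, the next sibling is that whitespace text node, not the following element. The sketch below uses invented, indented markup to show this, and uses `find_next_sibling()` (which skips text nodes) together with a `None` check as one possible way to make the lookup more tolerant. Whether the live page has such whitespace is not something this example verifies.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, indented so that whitespace sits between the tags.
html = """
<div class="row">
  <div>Humidity</div>
  <span>42%</span>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Match the div whose only content is the string "Humidity".
label = soup.find("div", string="Humidity")

# next_sibling is the whitespace between the tags, not the <span>.
raw_sibling = label.next_sibling

# find_next_sibling() skips text nodes and returns the next tag, or None.
value_tag = label.find_next_sibling() if label else None
humidity = value_tag.get_text(strip=True) if value_tag else None

print(repr(raw_sibling), humidity)
```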

4 answers

Answer 1
0 2022-08-07 21:50:27

Answer 2
0 2022-08-08 08:39:16

Answer 3
0 2022-08-08 08:43:38

Answer 4
0 2022-08-08 08:46:59
