Extract cell values from multi body html table using beautifulsoup
Extract data from a specific cell in a table using BeautifulSoup?
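I'm trying to extract the triage wait times for a specific hospital so that another application can use them. The data for all local hospitals is available at: https://www.health.wa.gov.au/emergencyactivity/EDdata/edsv/
Here is what I have so far: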
import requests
from bs4 import BeautifulSoup

URL = 'https://www.health.wa.gov.au/emergencyactivity/EDdata/edsv/'
headers = {
    "User-Agent": 'Mozilla/5.0 (X11; Linux x86_64; rv:76.0) Gecko/20100101 Firefox/76.0'
}
page = requests.get(URL, headers=headers)
soup = BeautifulSoup(page.content, 'html.parser')

table_rows = soup.find_all('tr')
for tr in table_rows:
    td = tr.find_all('td')
    row = [i.text for i in td]
    print(row)
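I only want to extract the triage times for Sir Charles Gairdner Hospital, but I'm not sure how to do that. Any help would be greatly appreciated!
You're almost there. Try something like this: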
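from bs4 import Tag

# Despite the name, this selects every <td> cell that sits inside a <tr>
table_rows = soup.select('tr td')
for td in table_rows:
    if td.text == 'Sir Charles Gairdner Hospital':
        # Walk the cells that follow the hospital name in the same row
        for ns in td.next_siblings:
            if isinstance(ns, Tag):  # skip any text nodes between the tags
                print(ns.text)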
Another option:
table = soup.select('table')[0]
# find_all('tr') reaches the rows even if they are wrapped in <tbody> elements
for row in table.find_all('tr'):
    tds = row.select('td')
    if len(tds) > 0 and tds[0].text == 'Sir Charles Gairdner Hospital':
        for td in tds:
            print(td.text)
Output:
73
5
36
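If the values are going to be passed to another application, it may be handier to return them than to print them. Here is a minimal sketch of that idea, run against a made-up table so it works offline. The `SAMPLE_HTML` markup, its column headers, the "Example General Hospital" row and the `cells_for_hospital` helper are all assumptions for illustration; the real page's layout may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page's layout and values may differ.
SAMPLE_HTML = """
<table>
  <tr><th>Hospital</th><th>A</th><th>B</th><th>C</th></tr>
  <tr>
    <td>Example General Hospital</td>
    <td>10</td>
    <td>2</td>
    <td>15</td>
  </tr>
  <tr>
    <td> Sir Charles Gairdner Hospital </td>
    <td>73</td>
    <td>5</td>
    <td>36</td>
  </tr>
</table>
"""


def cells_for_hospital(html, hospital):
    """Return the cell texts that follow `hospital` in its row, or None."""
    soup = BeautifulSoup(html, "html.parser")
    for row in soup.find_all("tr"):
        # get_text(strip=True) trims padding so the name comparison is exact
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        if cells and cells[0] == hospital:
            return cells[1:]
    return None  # no row starts with that hospital name


values = cells_for_hospital(SAMPLE_HTML, "Sir Charles Gairdner Hospital")
print(values)  # ['73', '5', '36']
```

With the live page you would pass `page.content` instead of `SAMPLE_HTML`. Returning `None` when nothing matches lets the caller notice if the hospital name or the page layout has changed.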
Edit: to print only the triage wait time for that hospital, use:
for td in table_rows:
    if td.text == 'Sir Charles Gairdner Hospital':
        print(td.next_sibling.text)  # note: it's "next_sibling", not "next_siblings" this time
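One caveat with `next_sibling`: if the markup has a newline or spaces between the `<td>` tags, the next sibling is that whitespace text node and not the next cell. The sketch below shows this on a small made-up snippet (the `html` string is an assumption, not the real page) and uses `find_next_sibling('td')`, which skips text nodes.

```python
from bs4 import BeautifulSoup, NavigableString

# Hypothetical markup with newlines between the cells
html = (
    "<table><tr>\n"
    "<td>Sir Charles Gairdner Hospital</td>\n"
    "<td>73</td>\n"
    "<td>5</td>\n"
    "</tr></table>"
)
soup = BeautifulSoup(html, "html.parser")

# Locate the cell whose text is exactly the hospital name
name_cell = soup.find("td", string="Sir Charles Gairdner Hospital")

# Here next_sibling is the "\n" text node, not the <td>73</td> tag
raw = name_cell.next_sibling
is_text = isinstance(raw, NavigableString)

# find_next_sibling("td") skips text nodes and returns the next <td> tag
first_value = name_cell.find_next_sibling("td").get_text(strip=True)
print(is_text, first_value)  # True 73
```

If the real page has no whitespace between its cells, `next_sibling` and `find_next_sibling('td')` give the same result, so the second form is simply the more defensive choice.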