Extracting data from a specific table cell using BeautifulSoup?

Question

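I'm trying to extract the triage wait times for a specific hospital so that another application can use them. The data for all local hospitals is available at: https://www.health.wa.gov.au/emergencyactivity/EDdata/edsv/

Here is what I have so far: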

import requests
from bs4 import BeautifulSoup

URL = 'https://www.health.wa.gov.au/emergencyactivity/EDdata/edsv/'

headers = {
    "User-Agent": 'Mozilla/5.0 (X11; Linux x86_64; rv:76.0) Gecko/20100101 Firefox/76.0'
}

page = requests.get(URL, headers=headers)

soup = BeautifulSoup(page.content, 'html.parser')

table_rows = soup.find_all('tr')

for tr in table_rows:
    td = tr.find_all('td')
    row = [i.text for i in td]
    print(row)

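I only want to extract the triage times for Sir Charles Gairdner Hospital, but I don't know how to do that. Any help would be much appreciated!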

Answer 1

You're almost there. Try something like this:

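from bs4 import Tag

# Despite the name, this selects every <td> cell that sits inside a <tr>
table_rows = soup.select('tr td')
for tr in table_rows:
    # each "tr" here is really a single <td> cell
    if tr.text == 'Sir Charles Gairdner Hospital':
        for ns in tr.next_siblings:
            # siblings can include whitespace text nodes; keep only real tags
            if isinstance(ns, Tag):
                print(ns.text)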
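
To see why the isinstance check is there, here is a minimal self-contained sketch. The markup is a made-up sample, not copied from the real page, so the actual table layout may differ. It shows that next_siblings can return whitespace text nodes in between the <td> tags:

```python
from bs4 import BeautifulSoup, NavigableString, Tag

# Hypothetical sample markup (an assumption, not the real page's HTML).
SAMPLE_HTML = """
<table>
  <tr>
    <td>Sir Charles Gairdner Hospital</td>
    <td>73</td>
    <td>5</td>
    <td>36</td>
  </tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Find the <td> whose only content is the hospital name.
name_cell = soup.find("td", string="Sir Charles Gairdner Hospital")

siblings = list(name_cell.next_siblings)

# Whitespace between the cells shows up as NavigableString nodes.
text_nodes = [s for s in siblings if isinstance(s, NavigableString)]

# Filtering on Tag leaves only the <td> elements that follow the name.
tag_values = [s.text for s in siblings if isinstance(s, Tag)]

print(tag_values)
```

Without the filter, the loop would also print those whitespace-only strings as blank lines.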

Another option:

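table = soup.select('table')[0]
# find_all('tr') returns only tags, so no isinstance check is needed
for row in table.find_all('tr'):
    tds = row.select('td')
    # match on the first cell of the row, then print every cell in that row
    if len(tds) > 0 and tds[0].text == 'Sir Charles Gairdner Hospital':
        for td in tds:
            print(td.text)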

Output (values only; the second snippet also prints the hospital name first):

73
5
36
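
If the other application needs more than one hospital, the row-based approach can be turned into a lookup table. This is only a sketch under assumptions: the sample markup, the "Example Hospital" row and the helper name rows_by_first_cell are all hypothetical, and the real page may lay out its columns differently.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup (an assumption, not the real page's HTML).
SAMPLE_HTML = """<table>
<tr><th>Hospital</th><th>A</th><th>B</th><th>C</th></tr>
<tr><td>Sir Charles Gairdner Hospital</td><td>73</td><td>5</td><td>36</td></tr>
<tr><td>Example Hospital</td><td>10</td><td>2</td><td>15</td></tr>
</table>"""


def rows_by_first_cell(html):
    """Map the text of each row's first <td> to the texts of the remaining cells."""
    soup = BeautifulSoup(html, "html.parser")
    result = {}
    for row in soup.find_all("tr"):
        # get_text(strip=True) trims surrounding whitespace from each cell.
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        if cells:  # header rows built from <th> have no <td> and are skipped
            result[cells[0]] = cells[1:]
    return result


data = rows_by_first_cell(SAMPLE_HTML)
print(data["Sir Charles Gairdner Hospital"])
```

With the real page you would pass page.content instead of SAMPLE_HTML and then check which position in the list holds the value you need.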

Edit: to print only the triage wait time for that hospital, use:

for tr in table_rows:
    if tr.text == 'Sir Charles Gairdner Hospital':
        print(tr.next_sibling.text)  # note: it's "next_sibling", not "siblings" this time
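
One caveat: next_sibling returns whatever node comes next, which may be a whitespace string rather than a <td> if the page's HTML has line breaks between cells. A slightly more defensive sketch uses find_next_sibling('td'), which skips over text nodes. The sample markup and the helper name first_value_after are hypothetical, for illustration only:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup (an assumption, not the real page's HTML).
SAMPLE_HTML = """
<table>
  <tr>
    <td> Sir Charles Gairdner Hospital </td>
    <td>73</td>
    <td>5</td>
    <td>36</td>
  </tr>
</table>
"""


def first_value_after(soup, label):
    """Return the text of the <td> right after the cell matching label, or None."""
    for cell in soup.find_all("td"):
        # strip=True makes the comparison tolerant of padding around the name.
        if cell.get_text(strip=True) == label:
            nxt = cell.find_next_sibling("td")  # skips whitespace text nodes
            return nxt.get_text(strip=True) if nxt is not None else None
    return None


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
wait = first_value_after(soup, "Sir Charles Gairdner Hospital")
print(wait)
```

Returning None when the hospital is not found lets the calling application decide how to handle a missing row instead of raising an AttributeError.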

Solution 1
0 Accepted 2020-05-09 16:03:15
