Is there a way to return one specific table from a web page with multiple tables in Python?

Question

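I can't return one specific table (the one titled "BRN Substantial Shareholders") from this web page - https://www.intelligentinvestor.com.au/shares/asx-brn/brainchip-holdings-ltd

I can return all of the tables with the following code.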

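from bs4 import BeautifulSoup

# driver is assumed to be a Selenium WebDriver that has already loaded the page
html = driver.page_source
soup = BeautifulSoup(html, 'html.parser')
all_tables = soup.find_all('table')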

I tried two different approaches to scrape it with bs, but I can't seem to find a way. Am I doing something wrong? Both outputs are an empty list.

Method 1

# Scrape the substantial holder list
html = driver.page_source
soup = BeautifulSoup(html, 'html.parser')

sub_headers = []
sub_holdings = []

for found_table in soup.find_all('table', class_=f'{ticker_code} + "Substantial Shareholders"'):
    sub_headers = found_table.find_all('th').append(sub_headers)
    sub_holdings = found_table.find_all('td').append(sub_holdings)

print(sub_headers)
print(sub_holdings)

Method 2

# Scrape the substantial holder list
html = driver.page_source
soup = BeautifulSoup(html, 'html.parser')

all_headers = soup.find_all("th", class_=f"{ticker_code} Substantial Shareholders")
all_holdings = soup.find_all("tr", class_=f"{ticker_code} Substantial Shareholders")

sub_headers = []
sub_holdings = []

for header in all_headers:
    sub_headers.append(header.text)

for holding in all_holdings:
    holding.append(sub_holdings.text)

print(sub_headers)
print(sub_holdings)

Answer 1

To scrape only the table labelled "BRN Substantial Shareholders", you can locate it with a CSS selector:

table = soup.select_one("div:nth-of-type(11) table")
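A positional selector such as the one above depends on the page layout staying the same. As a rough alternative sketch, the table can be located through the heading text that precedes it. Note that `class_` in BeautifulSoup filters on an element's CSS `class` attribute, not on its visible title, which is probably why both attempts in the question return empty lists. The `SAMPLE_HTML` markup and the `find_table_by_heading` helper are hypothetical and only for illustration; the real page's structure may differ and should be checked in the browser's inspector.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page (assumption, not the site's HTML).
SAMPLE_HTML = """
<div>
  <h3>BRN Price History</h3>
  <table>
    <tr><th>Date</th><th>Price</th></tr>
    <tr><td>2022-02-07</td><td>1.23</td></tr>
  </table>
</div>
<div>
  <h3>BRN Substantial Shareholders</h3>
  <table>
    <tr><th>Name</th><th>Holding</th></tr>
    <tr><td>Holder A</td><td>10.5%</td></tr>
  </table>
</div>
"""


def find_table_by_heading(soup, heading_text):
    """Return the first <table> after a heading containing heading_text, or None."""
    for heading in soup.find_all(["h1", "h2", "h3", "h4", "h5", "h6"]):
        if heading_text in heading.get_text():
            # find_next searches forward in document order from the heading.
            return heading.find_next("table")
    return None


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table = find_table_by_heading(soup, "Substantial Shareholders")

# Collect the header and data cell text from the matched table only.
sub_headers = [th.get_text(strip=True) for th in table.find_all("th")]
sub_holdings = [td.get_text(strip=True) for td in table.find_all("td")]

print(sub_headers)
print(sub_holdings)
```

With the real page, `SAMPLE_HTML` would be replaced by `driver.page_source`, and the helper's return value should be checked for `None` before use.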

Answer 2

Found a simpler way, shown here.

import pandas as pd

sub_table = pd.read_html(current_url, match='Holding')
print(sub_table)
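To illustrate how the `match` argument narrows the result, here is a small sketch that runs `pd.read_html` on an inline HTML string instead of a live URL. The `SAMPLE_HTML` content is made up for the example, and the call assumes an HTML parser that pandas supports (such as lxml) is installed. `read_html` always returns a list of DataFrames, keeping only the tables whose text matches the given string or regular expression.

```python
from io import StringIO

import pandas as pd

# Hypothetical markup with two tables; only the second contains the word "Holding".
SAMPLE_HTML = """
<table>
  <thead><tr><th>Date</th><th>Price</th></tr></thead>
  <tbody><tr><td>2022-02-07</td><td>1.23</td></tr></tbody>
</table>
<table>
  <thead><tr><th>Name</th><th>Holding</th></tr></thead>
  <tbody><tr><td>Holder A</td><td>10.5%</td></tr></tbody>
</table>
"""

# Wrap the string in StringIO so pandas treats it as a file-like object.
matching_tables = pd.read_html(StringIO(SAMPLE_HTML), match="Holding")

# The result is a list, so pick the first (here, the only) matching table.
sub_table = matching_tables[0]
print(sub_table)
```

If several tables on a page contain the matched text, the list will hold more than one DataFrame, so a more specific pattern may be needed.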


2 solutions

Solution 1
0 Accepted 2022-02-08 01:31:16

Solution 2
0 2022-02-08 02:36:25
