
Web Scraping a page with multiple tables

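I am trying to web scrape the second table from this website: https://fbref.com/en/comps/9/stats/Premier-League-Stats. However, when I look up the table tags, I only manage to extract information from the first table. Can anyone explain why I cannot access the second table, or show me how to do it?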

import requests
from bs4 import BeautifulSoup

url = "https://fbref.com/en/comps/9/stats/Premier-League-Stats"
res = requests.get(url)
soup = BeautifulSoup(res.text, 'lxml')
tables = soup.find_all("table")  # list of every <table> in the parsed tree
player_table = tables[0]

Something along these lines should work:

tables = soup.find_all("table")  # returns a list of tables
second_table = tables[1]

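The table is located inside an HTML comment (<!-- ... -->), so it is not part of the parsed element tree and find_all("table") does not return it.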
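A minimal sketch of why the table lookup misses such a table, using a hypothetical two-table page (the `HTML` string and its ids are made up for illustration and are not taken from fbref.com). It assumes only that `bs4` is installed:

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical miniature page: one live table and one table wrapped in a comment.
HTML = """
<div id="first"><table id="visible"><tr><td>A</td></tr></table></div>
<div id="second"><!--
<table id="hidden"><tr><td>B</td></tr></table>
--></div>
"""

soup = BeautifulSoup(HTML, "html.parser")

# Only the live table is part of the parsed element tree.
visible_ids = [t.get("id") for t in soup.find_all("table")]

# The commented-out markup survives as Comment text nodes.
comments = soup.find_all(string=lambda s: isinstance(s, Comment))

print(visible_ids)               # ['visible']
print(len(comments))             # 1
print("<table" in comments[0])   # True
```

The commented table is still present in the response body, but only as text, which is why it has to be parsed a second time before it can be queried.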

To get the table out of the comment, you can use this example:

import requests
from bs4 import BeautifulSoup, Comment


url = 'https://fbref.com/en/comps/9/stats/Premier-League-Stats'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

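# Find the first comment after the wrapper element and parse its contents as HTML.
# (string= is the current name for the older text= argument.)
table = BeautifulSoup(soup.select_one('#all_stats_standard').find_next(string=lambda x: isinstance(x, Comment)), 'html.parser')

# Print some information from the table to the screen: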
for tr in table.select('tr:has(td)'):
    tds = [td.get_text(strip=True) for td in tr.select('td')]
    print('{:<30}{:<20}{:<10}'.format(tds[0], tds[3], tds[5]))

Prints:

Patrick van Aanholt           Crystal Palace      1990      
Max Aarons                    Norwich City        2000      
Tammy Abraham                 Chelsea             1997      
Che Adams                     Southampton         1996      
Adrián                        Liverpool           1987      
Sergio Agüero                 Manchester City     1988      
Albian Ajeti                  West Ham            1997      
Nathan Aké                    Bournemouth         1995      
Marc Albrighton               Leicester City      1989      
Toby Alderweireld             Tottenham           1989      

...and so on.
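If several tables on a page are hidden this way, the same idea can be generalized. The following is a sketch rather than a definitive implementation: `all_tables` is a hypothetical helper name, and the sample markup is invented so the snippet runs without network access.

```python
from bs4 import BeautifulSoup, Comment


def all_tables(html):
    """Return live tables first, then tables found inside HTML comments."""
    soup = BeautifulSoup(html, "html.parser")
    tables = list(soup.find_all("table"))
    for comment in soup.find_all(string=lambda s: isinstance(s, Comment)):
        if "<table" in comment:
            # Re-parse the comment text so its markup becomes real elements.
            inner = BeautifulSoup(comment, "html.parser")
            tables.extend(inner.find_all("table"))
    return tables


# Invented sample markup, standing in for a downloaded page.
sample = """
<table id="visible"><tr><td>A</td></tr></table>
<!-- <table id="hidden"><tr><td>B</td></tr></table> -->
"""

ids = [t.get("id") for t in all_tables(sample)]
print(ids)  # ['visible', 'hidden']
```

Note that the returned list is not in document order: commented tables are appended after the live ones, so an index such as `tables[1]` may not match the table's position on the rendered page.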

