I am trying to scrape the second table from https://fbref.com/en/comps/9/stats/Premier-League-Stats but I have only ever managed to extract the first table when looking up the table tag. Could anyone explain why I cannot access the second table, or show me how to do it?
import requests
from bs4 import BeautifulSoup
url = "https://fbref.com/en/comps/9/stats/Premier-League-Stats"
res = requests.get(url)
soup = BeautifulSoup(res.text, 'lxml')
tables = soup.find_all("table")  # list of every <table> the parser can see
player_table = tables[0]
Something along these lines should do it:
tables = soup.find_all("table") # returns a list of tables
second_table = tables[1]
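The table is inside an HTML comment (<!-- ... -->), so find_all("table") never sees it: the parser treats everything inside the comment as plain text rather than as tags.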
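As a small illustration of that behaviour, here is a sketch that uses a made-up HTML snippet instead of the real page. The markup, the ids t1 and t2, and the variable names are all assumptions chosen for the demo, not taken from fbref.com.

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical stand-in for the page: one normal table, one inside a comment.
html = """
<div id="first"><table id="t1"><tr><td>visible</td></tr></table></div>
<div id="second"><!--
<table id="t2"><tr><td>hidden</td></tr></table>
--></div>
"""

soup = BeautifulSoup(html, "html.parser")

# Commented-out markup is stored as text, so only one <table> tag is found.
visible_tables = soup.find_all("table")
print(len(visible_tables))

# The comment is reachable as a Comment node that holds the raw HTML,
# which can be parsed a second time to get at the table inside it.
comment = soup.find(string=lambda s: isinstance(s, Comment))
hidden_table = BeautifulSoup(comment, "html.parser").find("table")
print(hidden_table["id"])
```

This would also explain why tables[1] raises an IndexError on a page where the second table is commented out.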
To get the table from comments, you can use this example:
import requests
from bs4 import BeautifulSoup, Comment
url = 'https://fbref.com/en/comps/9/stats/Premier-League-Stats'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')
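# The stats table sits in a comment inside the #all_stats_standard container.
# (string= is the current name of the older text= argument.)
comment = soup.select_one('#all_stats_standard').find_next(string=lambda x: isinstance(x, Comment))
table = BeautifulSoup(comment, 'html.parser')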
# Print some information from the table to the screen:
for tr in table.select('tr:has(td)'):
    tds = [td.get_text(strip=True) for td in tr.select('td')]
    print('{:<30}{:<20}{:<10}'.format(tds[0], tds[3], tds[5]))
Prints:
Patrick van Aanholt           Crystal Palace      1990
Max Aarons                    Norwich City        2000
Tammy Abraham                 Chelsea             1997
Che Adams                     Southampton         1996
Adrián                        Liverpool           1987
Sergio Agüero                 Manchester City     1988
Albian Ajeti                  West Ham            1997
Nathan Aké                    Bournemouth         1995
Marc Albrighton               Leicester City      1989
Toby Alderweireld             Tottenham           1989
...and so on.
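If several tables on a page are hidden this way, the same idea can be wrapped in a helper. The following is only a sketch: find_all_tables is a hypothetical name and the sample HTML is invented for the demo. Note that it returns the visible tables first and the commented ones after them, which is not necessarily document order.

```python
from bs4 import BeautifulSoup, Comment


def find_all_tables(html):
    """Return every <table>, including ones hidden inside HTML comments.

    Hypothetical helper: visible tables come first, commented ones after.
    """
    soup = BeautifulSoup(html, "html.parser")
    tables = list(soup.find_all("table"))
    # Walk every comment node and re-parse those that contain table markup.
    for comment in soup.find_all(string=lambda s: isinstance(s, Comment)):
        if "<table" in comment:
            inner = BeautifulSoup(comment, "html.parser")
            tables.extend(inner.find_all("table"))
    return tables


# Invented sample markup, used only to exercise the helper.
sample_html = """
<table id="t1"><tr><td>a</td></tr></table>
<!-- just a note, no markup -->
<!-- <table id="t2"><tr><td>b</td></tr></table> -->
"""

table_ids = [t["id"] for t in find_all_tables(sample_html)]
print(table_ids)
```

With the real page you would pass requests.get(url).content instead of sample_html and then pick the table you need by index or by its id attribute.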