I am very new to coding and I am trying to build a web scraper for Excel so that I can transfer it to Google Sheets. Unfortunately, the code that I have written is working for other people, but not me.
This is the code I have written:
import requests
from bs4 import BeautifulSoup, Comment
import pandas as pd
URL = 'https://www.hockey-reference.com/leagues/NHL_2021.html'
csv_name = 'nhl_season_stats.csv'
def get_nhl_stats(URL):
headers = {'User-Agent':'Mozilla/5.0 (X11; Linux x86_64)
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36'}
pageTree = requests.get(URL, headers=headers)
pageSoup = BeautifulSoup(pageTree.content, 'html.parser')
comments = pageSoup.find_all(string=lambda text: isinstance(text, Comment))
tables = []
for each in comments:
if 'table' in each:
try:
tables.append(pd.read_html(each, header=1)[0])
except:
continue
df = tables[0]
df = df.rename(columns={'Unnamed: 1':'Team'})
df.to_csv(csv_name, index = False)
print(df)
get_nhl_stats(URL)
After running it, I receive this error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 13, in get_nhl_stats
IndexError: list index out of range
Sorry for my bad jargon, as I am very new and very confused, but any help would be greatly appreciated!
this code working, maybe the problem is in the declaration of the class "Comment" or the server does not give you the requested values:
import requests
from bs4 import BeautifulSoup
import pandas as pd
URL = 'https://www.hockey-reference.com/leagues/NHL_2021.html'
csv_name = 'nhl_season_stats.csv'
def get_nhl_stats(URL):
headers = {'User-Agent':'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36'}
pageTree = requests.get(URL, headers=headers)
pageSoup = BeautifulSoup(pageTree.content, 'html.parser')
comments = pageSoup.find_all(string=lambda text: isinstance(text, str))
tables = []
for each in comments:
if 'table' in each:
try:
tables.append(pd.read_html(each, header=1)[0])
except:
continue
df = tables[0]
df = df.rename(columns={'Unnamed: 1':'Team'})
df.to_csv(csv_name, index = False)
print(df)
get_nhl_stats(URL)
output:
Rk Team AvAge GP W L OL PTS PTS% GF GA SOW SOL SRS SOS TG/G EVGF EVGA PP PPO PP% PPA PPOA PK% SH SHA PIM/G oPIM/G S S% SA SV% SO
0 1.0 Toronto Maple Leafs 29.0 6 4 2 0 8 0.667 19 17 0.0 0.0 0.33 -0.01 6.00 11 12 8 18 44.44 4 22 81.82 0 1 10.5 7.5 190 10.0 157 0.892 0
1 2.0 Montreal Canadiens 28.6 5 3 0 2 8 0.800 24 15 0.0 1.0 0.77 -0.83 7.80 14 8 6 20 30.00 6 25 76.00 4 1 11.4 10.6 180 13.3 140 0.893 0
2 3.0 Vegas Golden Knights 28.9 5 4 1 0 8 0.800 18 12 0.0 0.0 1.12 -0.08 6.00 15 8 2 18 11.11 3 18 83.33 1 1 7.2 7.2 150 12.0 125 0.904 0
3 4.0 Minnesota Wild 29.1 5 4 1 0 8 0.800 15 10 0.0 0.0 0.86 -0.14 5.00 13 9 1 23 4.35 1 16 93.75 1 0 7.6 10.4 166 9.0 147 0.932 0
4 5.0 Washington Capitals 30.1 5 3 0 2 8 0.800 18 16 1.0 1.0 0.10 -0.30 6.80 16 12 2 9 22.22 3 18 83.33 0 1 8.6 5.0 130 13.8 141 0.887 0
5 6.0 Philadelphia Flyers 27.0 5 3 1 1 7 0.700 19 15 0.0 1.0 0.36 -0.24 6.80 14 10 5 17 29.41 5 18 72.22 0 0 7.2 6.8 125 15.2 187 0.920 1
6 7.0 Colorado Avalanche 26.9 5 3 2 0 6 0.600 17 12 0.0 0.0 0.47 -0.53 5.80 7 9 10 25 40.00 3 19 84.21 0 0 8.0 10.4 147 11.6 143 0.916 1
7 8.0 Winnipeg Jets 27.9 4 3 1 0 6 0.750 13 10 0.0 0.0 1.10 0.35 5.75 11 6 2 20 10.00 4 12 66.67 0 0 10.3 14.3 119 10.9 134 0.925 0
8 9.0 New York Islanders 28.9 4 3 1 0 6 0.750 9 6 0.0 0.0 0.61 -0.14 3.75 5 5 4 20 20.00 1 15 93.33 0 0 11.5 11.0 108 8.3 114 0.947 2
9 10.0 Tampa Bay Lightning 27.7 3 3 0 0 6 1.000 13 5 0.0 0.0 1.70 -0.97 6.00 11 2 2 8 25.00 3 11 72.73 0 0 9.0 7.0 107 12.1 85 0.941 0
10 11.0 Pittsburgh Penguins 28.6 5 3 2 0 6 0.600 16 21 2.0 0.0 -0.43 0.17 7.40 10 16 5 18 27.78 5 19 73.68 1 0 7.6 7.2 152 10.5 130 0.838 0
11 12.0 New Jersey Devils 26.2 4 2 1 1 5 0.625 9 10 0.0 1.0 -0.35 0.15 4.75 8 3 1 11 9.09 6 16 62.50 0 1 9.8 7.3 112 8.0 150 0.933 0
12 13.0 St. Louis Blues 28.3 4 2 1 1 5 0.625 10 14 0.0 1.0 -1.66 -0.41 6.00 10 6 0 14 0.00 8 21 61.90 0 0 11.0 7.5 109 9.2 129 0.891 0
13 14.0 Boston Bruins 28.8 4 2 1 1 5 0.625 7 9 2.0 0.0 0.07 0.07 4.00 3 7 3 13 23.08 2 18 88.89 1 0 11.3 8.8 135 5.2 96 0.906 0
14 15.0 Arizona Coyotes 28.4 5 2 2 1 5 0.500 17 17 0.0 1.0 -0.04 0.16 6.80 11 11 5 22 22.73 5 24 79.17 1 1 10.4 9.6 144 11.8 157 0.892 0
15 16.0 Calgary Flames 28.1 3 2 0 1 5 0.833 11 6 0.0 0.0 1.14 -0.52 5.67 5 4 6 16 37.50 1 12 91.67 0 1 8.7 11.3 93 11.8 93 0.935 1
16 17.0 Edmonton Oilers 27.9 6 2 4 0 4 0.333 15 20 0.0 0.0 -0.91 -0.08 5.83 10 14 3 23 13.04 4 18 77.78 2 2 7.7 9.3 192 7.8 200 0.900 0
17 18.0 Vancouver Canucks 27.3 6 2 4 0 4 0.333 17 28 1.0 0.0 -1.34 0.33 7.50 12 17 4 26 15.38 9 31 70.97 1 2 13.3 10.7 179 9.5 222 0.874 0
18 19.0 Anaheim Ducks 28.6 5 1 2 2 4 0.400 8 13 0.0 0.0 -0.10 0.90 4.20 8 10 0 12 0.00 2 15 86.67 0 1 6.4 5.2 133 6.0 160 0.919 1
19 20.0 Columbus Blue Jackets 26.6 5 1 2 2 4 0.400 10 16 0.0 0.0 -1.19 0.01 5.20 9 15 1 11 9.09 1 10 90.00 0 0 9.0 9.4 152 6.6 169 0.905 0
20 21.0 Los Angeles Kings 28.3 4 1 1 2 4 0.500 12 13 0.0 0.0 0.43 0.68 6.25 8 10 4 17 23.53 3 21 85.71 0 0 11.0 9.0 119 10.1 121 0.893 0
21 22.0 Detroit Red Wings 29.3 5 2 3 0 4 0.400 10 14 0.0 0.0 -1.54 -0.74 4.80 9 9 1 12 8.33 4 16 75.00 0 1 11.4 9.8 130 7.7 155 0.910 0
22 23.0 San Jose Sharks 29.4 5 2 3 0 4 0.400 12 18 2.0 0.0 -1.32 -0.52 6.00 7 16 5 21 23.81 2 18 88.89 0 0 8.4 9.6 162 7.4 148 0.878 0
23 24.0 Carolina Hurricanes 27.0 3 2 1 0 4 0.667 9 6 0.0 0.0 0.26 -0.74 5.00 6 5 3 12 25.00 1 9 88.89 0 0 7.7 9.7 98 9.2 68 0.912 1
24 25.0 Florida Panthers 27.8 2 2 0 0 4 1.000 10 6 0.0 0.0 1.29 -0.71 8.00 7 3 3 8 37.50 3 5 40.00 0 0 5.0 8.0 66 15.2 66 0.909 0
25 26.0 Nashville Predators 28.7 4 2 2 0 4 0.500 10 14 0.0 0.0 0.01 1.01 6.00 9 7 1 16 6.25 6 16 62.50 0 1 8.0 8.0 135 7.4 126 0.889 0
26 27.0 Buffalo Sabres 27.2 5 1 3 1 3 0.300 14 15 0.0 1.0 -0.18 0.22 5.80 11 14 3 17 17.65 1 6 83.33 0 0 3.8 8.2 161 8.7 133 0.887 0
27 28.0 New York Rangers 25.6 4 1 2 1 3 0.375 11 11 0.0 1.0 -0.15 0.11 5.50 7 7 4 21 19.05 4 16 75.00 0 0 8.5 14.0 140 7.9 112 0.902 1
28 29.0 Chicago Blackhawks 26.9 5 1 3 1 3 0.300 13 21 0.0 0.0 -0.43 1.17 6.80 5 16 7 17 41.18 5 20 75.00 1 0 8.0 6.8 154 8.4 167 0.874 0
29 30.0 Ottawa Senators 27.0 4 1 2 1 3 0.375 11 14 0.0 0.0 -0.04 0.71 6.25 8 10 3 18 16.67 4 21 80.95 0 0 14.3 15.3 113 9.7 120 0.883 0
30 31.0 Dallas Stars 28.8 1 1 0 0 2 1.000 7 0 0.0 0.0 7.30 0.30 7.00 1 0 5 8 62.50 0 5 100.00 1 0 10.0 16.0 28 25.0 34 1.000 1
31 NaN League Average 28.0 4 2 2 1 5 0.574 13 13 NaN NaN NaN NaN 5.94 9 9 4 16 21.33 4 16 78.67 0 0 8.0 8.0 133 9.8 133 0.902 0
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.