Web scraping NBA stats with BeautifulSoup and Selenium

I'm trying to get the NBA advanced stats data but keep getting errors. This is what I have. Please help.

from selenium import webdriver
from bs4 import BeautifulSoup as soup 
d = webdriver.Chrome('C:/chromedriver.exe')
d.get('https://www.nba.com/stats/players/passing/?Season=2019-20&SeasonType=Regular%20Season&TeamID=1610612747')
s = soup(d.page_source, 'html.parser').find('table', {'class':'nba-stat-table__overflow'})
headers, [_, *data] = [i.text for i in soup.find_all('th')], [[i.text for i in soup.find_all('td')] for i in soup.find_all('tr')]
final_data = [i for i in data if len(i) > 1]
print(final_data)

There is no 'table' with 'class'='table-responsive'. There is a 'div' element with 'class'='table-responsive', and it has a 'table' underneath with 'class'='table'. So this line returns None:

s = soup(d.page_source, 'html.parser').find('table', {'class':'table-responsive'})
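To make the point above concrete, here is a minimal sketch that runs BeautifulSoup against a small, hand-written HTML fragment instead of the live page. `SAMPLE_HTML` and `extract_table` are hypothetical names made up for illustration, and the fragment only mimics the div/table nesting described above; the real NBA markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for d.page_source; it only mimics the nesting
# described above (a div.table-responsive wrapping a table.table).
SAMPLE_HTML = """
<div class="table-responsive">
  <table class="table">
    <thead><tr><th>Player</th><th>Team</th><th>GP</th></tr></thead>
    <tbody>
      <tr><td>Alex Caruso</td><td>LAL</td><td>64</td></tr>
      <tr><td>Anthony Davis</td><td>LAL</td><td>62</td></tr>
    </tbody>
  </table>
</div>
"""


def extract_table(html):
    """Return (headers, rows) from the table nested in div.table-responsive."""
    page = BeautifulSoup(html, "html.parser")
    wrapper = page.find("div", {"class": "table-responsive"})
    if wrapper is None:
        return [], []
    table = wrapper.find("table")
    if table is None:
        return [], []
    headers = [th.get_text(strip=True) for th in table.find_all("th")]
    # Search for cells inside each row, not across the whole document.
    rows = [[td.get_text(strip=True) for td in tr.find_all("td")]
            for tr in table.find_all("tr")]
    # The header row contains no <td> cells, so drop empty rows.
    rows = [row for row in rows if row]
    return headers, rows


page = BeautifulSoup(SAMPLE_HTML, "html.parser")
# Looking for a <table> with the wrapper's class finds nothing.
missing = page.find("table", {"class": "table-responsive"})
print(missing)

headers, rows = extract_table(SAMPLE_HTML)
print(headers)
print(rows)
```

Note that the helper calls `find_all` on the parsed `table` object and on each `tr`, rather than on the imported `soup` class as the code in the question does.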

Just use pandas to read in the tables once you get the page source from selenium. Note that you'll likely need to add an implicit wait so the page has time to render.

import pandas as pd
from selenium import webdriver

d = webdriver.Chrome('C:/Users/kgrab/OneDrive/Desktop/web mining/week3-twitter/chromedriver_win32/chromedriver.exe')
d.get('https://www.nba.com/stats/players/passing/?Season=2019-20&SeasonType=Regular%20Season&TeamID=1610612747')

df = pd.read_html(d.page_source)[0]
d.close()

Output:

print(df)
                      Player Team  GP  ...  ASTAdj  AST ToPass%  AST ToPass% Adj
0                Alex Caruso  LAL  64  ...     2.3          8.0              9.6
1              Anthony Davis  LAL  62  ...     3.9          8.3              9.9
2              Avery Bradley  LAL  49  ...     1.6          6.8              8.4
3                Danny Green  LAL  68  ...     1.7          4.7              5.8
4             Devontae Cacok  LAL   1  ...     1.0         16.7             16.7
5               Dion Waiters  LAL   7  ...     3.6          9.8             14.5
6              Dwight Howard  LAL  69  ...     0.8          2.8              3.2
7                   JR Smith  LAL   6  ...     0.7          6.0              8.0
8               JaVale McGee  LAL  68  ...     0.7          3.3              4.4
9               Jared Dudley  LAL  45  ...     0.8          5.7              7.0
10  Kentavious Caldwell-Pope  LAL  69  ...     1.8          7.5              8.6
11      Kostas Antetokounmpo  LAL   5  ...     0.6         11.8             17.6
12                Kyle Kuzma  LAL  61  ...     1.6          6.4              7.9
13              LeBron James  LAL  67  ...    11.8         16.4             19.0
14           Markieff Morris  LAL  14  ...     0.8          4.1              5.7
15                Quinn Cook  LAL  44  ...     1.1          8.3              8.5
16               Rajon Rondo  LAL  48  ...     6.1         12.7             15.5
17       Talen Horton-Tucker  LAL   6  ...     2.0          5.7             11.3
18              Troy Daniels  DEN  41  ...     0.5          6.3              9.0
19          Zach Norvell Jr.  GSW   2  ...     0.0          0.0              0.0

[20 rows x 16 columns]
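The answer mentions waiting for the page to render but does not show it. The sketch below is one possible way to do that, not the answerer's code. It uses an explicit `WebDriverWait`, because Selenium's implicit wait only affects element lookups and does not delay `page_source`. The helper names `first_table` and `fetch_rendered_html`, the 20-second timeout, and the bare `"table"` CSS selector are all assumptions, and `webdriver.Chrome()` with no arguments assumes Selenium can locate a chromedriver on its own.

```python
from io import StringIO

import pandas as pd


def first_table(html):
    """Parse the first <table> found in an HTML string into a DataFrame."""
    return pd.read_html(StringIO(html))[0]


def fetch_rendered_html(url, timeout=20):
    """Load url in Chrome and return the page source once a table exists.

    Hypothetical helper: the timeout and the "table" selector are guesses.
    """
    # Imported here so first_table can be used without a browser installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumes chromedriver is discoverable
    try:
        driver.get(url)
        # Block until at least one <table> is present in the DOM.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, "table"))
        )
        return driver.page_source
    finally:
        driver.quit()  # always shut the browser down, even on timeout


# Offline demonstration on a hand-written fragment (not real NBA markup).
SAMPLE_HTML = """
<table>
  <tr><th>Player</th><th>Team</th><th>GP</th></tr>
  <tr><td>Alex Caruso</td><td>LAL</td><td>64</td></tr>
  <tr><td>Anthony Davis</td><td>LAL</td><td>62</td></tr>
</table>
"""
df = first_table(SAMPLE_HTML)
print(df)
```

Against the live site, the two helpers would be combined as `df = first_table(fetch_rendered_html(url))`, with `url` set to the stats page address from the question. Whether the first table on the page is the one you want still has to be checked by hand.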

