
Can't find <div ng-view> from NBA stats website with BeautifulSoup Python

I'm trying to scrape this NBA website, https://stats.nba.com/team/1610612738/ . What I want is to extract each player's name, NO, POS, and the rest of the information for every player. The problem is that my code can't find <div ng-view>, which is the parent of the <nba-stat-table> element where the table lives.

My code so far is:

from selenium import webdriver
from bs4 import BeautifulSoup

def get_Player():
    driver = webdriver.PhantomJS(executable_path=r'D:\Documents\Python\Web Scraping\phantomjs.exe')

    url = 'https://stats.nba.com/team/1610612738/'

    driver.get(url)

    data = driver.page_source.encode('utf-8')

    soup = BeautifulSoup(data, 'lxml')

    div1 = soup.find('div', class_="columns / small-12 / section-view-overlay")
    print(div1.find_all('div'))

get_Player()

Use the JSON endpoint that the page itself calls to get that content. It's far easier to handle and there's no need for Selenium. You can find the URL in your browser's network tab.

import requests
import pandas as pd

r = requests.get(
    'https://stats.nba.com/stats/commonteamroster?LeagueID=00&Season=2018-19&TeamID=1610612738',
    headers={'User-Agent': 'Mozilla/5.0'}
).json()
players_info = r['resultSets'][0]
df = pd.DataFrame(players_info['rowSet'], columns = players_info['headers'])
print(df.head())
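If you only need the name, number, and position, you can select just those columns from the DataFrame. A minimal sketch below uses a hand-written sample shaped like the endpoint's `resultSets[0]` instead of a live request; the column names `PLAYER`, `NUM`, and `POSITION` are what this endpoint returned at the time of writing and should be verified against `r['resultSets'][0]['headers']`:

```python
import pandas as pd

# Sample shaped like the endpoint's resultSets[0] (headers + rowSet);
# the real response has more columns (HEIGHT, WEIGHT, BIRTH_DATE, ...).
players_info = {
    'headers': ['PLAYER', 'NUM', 'POSITION'],
    'rowSet': [
        ['Jayson Tatum', '0', 'F'],
        ['Jonathan Gibson', '3', 'G'],
    ],
}
df = pd.DataFrame(players_info['rowSet'], columns=players_info['headers'])

# Keep only the columns the question asks about (column names are an
# assumption based on the 2018-19 commonteamroster response).
roster = df[['PLAYER', 'NUM', 'POSITION']]
print(roster.to_string(index=False))
```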


The find_all function always returns a list; findChildren() returns all children of a Tag object (see the BeautifulSoup documentation for more details).
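To illustrate the difference on a minimal, hand-written table row (standing in for one row of the rendered roster table):

```python
from bs4 import BeautifulSoup

# One hand-written row as a stand-in for the rendered table.
html = '<tr><td>Jayson Tatum</td><td>#0</td><td>F</td></tr>'
tr = BeautifulSoup(html, 'html.parser').tr

# find_all('td') returns a list of the matching descendant tags...
cells = tr.find_all('td')
print([td.text for td in cells])

# ...while findChildren() with no arguments returns every child tag,
# which here is the same three <td> elements.
print([td.text for td in tr.findChildren()])
```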

Replace your code:

div1 = soup.find('div', class_="columns / small-12 / section-view-overlay")
print(div1.find_all('div')) 

with:

div = soup.find('div', {'class':"nba-stat-table__overflow"})
for tr in div.find("tbody").find_all("tr"):
    for td in tr.findChildren():
        print(td.text)

UPDATE:

from selenium import webdriver

from bs4 import BeautifulSoup
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

def get_Player():
    driver = webdriver.PhantomJS(executable_path=r'D:\Documents\Python\Web Scraping\phantomjs.exe')

    url = 'https://stats.nba.com/team/1610612738/'

    driver.get(url)

    WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CLASS_NAME, "nba-stat-table__overflow")))

    data = driver.page_source.encode('utf-8')

    soup = BeautifulSoup(data, 'lxml')

    div = soup.find('div', {'class':"nba-stat-table__overflow"})
    for tr in div.find("tbody").find_all("tr"):
        for td in tr.findChildren():
            print(td.text)

get_Player()

Output:

Jayson Tatum
Jayson Tatum
#0
F
6-8
208 lbs
MAR 03, 1998
21
1
Duke
Jonathan Gibson
Jonathan Gibson
#3
G
6-2
185 lbs
NOV 08, 1987
31
2
New Mexico State
....

Why do you want to find all the divs? If it's just the player name you want to extract, you can use this CSS selector:

td.player a

Code :

all_player = driver.find_elements_by_css_selector('td.player a')
for playername in all_player:
    print(playername.text)
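The same selector also works in BeautifulSoup via select(), if you already have the rendered page source. A minimal sketch, with a hand-written row standing in for the Angular-generated table (the real page must be rendered before `driver.page_source` contains it):

```python
from bs4 import BeautifulSoup

# Hand-written stand-in for two rendered roster rows; the class name
# 'player' matches the selector used with Selenium above.
html = '''
<table><tbody>
  <tr><td class="player"><a href="/player/1628369/">Jayson Tatum</a></td></tr>
  <tr><td class="player"><a href="/player/202269/">Jonathan Gibson</a></td></tr>
</tbody></table>
'''
soup = BeautifulSoup(html, 'html.parser')

# select() accepts the same CSS selector string as Selenium's
# find_elements_by_css_selector.
names = [a.text for a in soup.select('td.player a')]
print(names)
```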
