
Web Scraping NBA Games Python

I've been trying to write code that extracts the names of the teams playing on a given day. This link ( https://www.nba.com/scores ) shows the games of the day. Looking through the source code, I noticed that the names are contained in div tags with class="score-tile__team-name". I wrote this code, but it prints nothing. Thank you for the help!

from bs4 import BeautifulSoup
import requests

response = requests.get("https://www.nba.com/scores").text

soup = BeautifulSoup(response, 'html.parser')

for div in soup.find_all('div', class_="score-tile__team-name"):
    print(div.text)

You're using the "inspect element" tool, which shows you the page as it currently exists, after a bunch of JavaScript has run. This is different from how the page is initially loaded. If you click "view source" you'll see the initial page that was loaded, before any JavaScript was run, which unfortunately does not contain the scores.

When you run requests.get you're getting that initial page, without the scores, so find_all has no matching div tags to return.
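A quick way to confirm this is to count how many tags in a given HTML string carry the class you are looking for. The sketch below uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra installs; the names ClassCounter and count_class, and both HTML strings, are made up for illustration and are not the real nba.com markup.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Counts start tags whose class attribute includes a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value can be None
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_class(html, class_name):
    """Return how many tags in `html` have `class_name` among their classes."""
    parser = ClassCounter(class_name)
    parser.feed(html)
    return parser.count


# Hypothetical stand-in for what "view source" shows: an empty shell plus a script.
initial_html = '<html><body><div id="app"></div><script src="bundle.js"></script></body></html>'

# Hypothetical stand-in for what "inspect element" shows after JavaScript has run.
rendered_html = (
    '<html><body><div id="app">'
    '<div class="score-tile__team-name">Heat</div>'
    '<div class="score-tile__team-name">Cavaliers</div>'
    '</div></body></html>'
)

print(count_class(initial_html, "score-tile__team-name"))   # 0
print(count_class(rendered_html, "score-tile__team-name"))  # 2
```

To try it on the real page, pass requests.get("https://www.nba.com/scores").text as the first argument; a count of 0 would match the blank output described in the question.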

What you might want to do as a next step is open up the browser's developer tools, select Network, and then reload the page. Then look through what gets loaded to see if you can find the file that contains the scores.
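If the list of requests is long, one option is to export the Network log as a HAR file (a JSON document) and filter it for JSON responses. This is only a sketch: the function name json_urls_from_har and the sample data are hypothetical, and the sample is a heavily trimmed stand-in for a real export.

```python
def json_urls_from_har(har):
    """Return the URLs of logged responses whose MIME type mentions JSON."""
    urls = []
    for entry in har["log"]["entries"]:
        mime = entry["response"]["content"].get("mimeType", "")
        if "json" in mime.lower():
            urls.append(entry["request"]["url"])
    return urls


# Hypothetical, trimmed-down HAR content for illustration only.
sample_har = {
    "log": {
        "entries": [
            {
                "request": {"url": "https://www.nba.com/scores"},
                "response": {"content": {"mimeType": "text/html"}},
            },
            {
                "request": {"url": "https://example.com/scoreboard.json"},
                "response": {"content": {"mimeType": "application/json"}},
            },
        ]
    }
}

print(json_urls_from_har(sample_har))
```

With a real export, load the file with json.load first and pass the resulting dict to the function; the URLs it returns are candidates to open and inspect for the scores.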

You can get that data directly by requesting the JSON file the page loads its scores from.

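import requests

# Fetch the scoreboard JSON (the URL contains the date, here 20200224) and parse it
jsonData = requests.get('https://data.nba.net/prod/v2/20200224/scoreboard.json').json()

# Print each game as "visitor @ home" using the teams' three-letter codes
for game in jsonData['games']:
    print('%s @ %s' % (game['vTeam']['triCode'], game['hTeam']['triCode']))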

Output:

MIA @ CLE
ATL @ PHI
MIL @ WAS
ORL @ BKN
NYK @ HOU
MIN @ DAL
PHX @ UTA
MEM @ LAC
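To get the games for another day, the URL can be built from a date. The sketch below assumes the 20200224 segment in the URL is the date in YYYYMMDD form, which is inferred from the example rather than documented here. The helper names scoreboard_url and format_matchups are made up for illustration, and the sample payload only mimics the fields used above.

```python
from datetime import date


def scoreboard_url(day):
    """Build the scoreboard URL for a date, assuming a YYYYMMDD path segment."""
    return 'https://data.nba.net/prod/v2/%s/scoreboard.json' % day.strftime('%Y%m%d')


def format_matchups(data):
    """Turn a scoreboard payload into 'VISITOR @ HOME' strings."""
    return ['%s @ %s' % (game['vTeam']['triCode'], game['hTeam']['triCode'])
            for game in data['games']]


# Hypothetical payload containing only the fields the code reads.
sample = {
    'games': [
        {'vTeam': {'triCode': 'MIA'}, 'hTeam': {'triCode': 'CLE'}},
        {'vTeam': {'triCode': 'ATL'}, 'hTeam': {'triCode': 'PHI'}},
    ]
}

print(scoreboard_url(date(2020, 2, 24)))
for line in format_matchups(sample):
    print(line)
```

With requests installed, format_matchups(requests.get(scoreboard_url(some_date)).json()) would combine the two helpers, provided the endpoint still serves that date.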
