Scraping a website which has a table, but the next button on the table doesn't change the URL
I want to scrape this link and get the whole table of players: https://www.nba.com/stats/leaders/?StatCategory=FG3M&PerMode=Totals&Season=2015-16&SeasonType=Regular%20Season
If you click the next button in the table, the contents of the table change but the URL in the address bar doesn't. The button also isn't a button tag. It looks like this:
<a class="stats-table-pagination__next" href="" alt="Next Page" ng-click="nav(1)">
<i class="fa fa-angle-right" aria-hidden="true"></i>
</a>
I tried using Beautiful Soup and Selenium to scrape this website, but I can't figure out how to navigate to the other pages of the table so that I can scrape them too. Please suggest a solution.
You can open the page in Google Chrome's developer tools and find the JSON request that carries all the data shown in the table. The table is filled in by JavaScript from that JSON, which is why paging through it doesn't change the URL.
Go to the Network tab, refresh the page, and switch to the XHR filter. You will see many requests; one of them returns the players' information.
Once you have found that request, click on it, copy its address, and use the requests module to fetch the JSON and extract the information.
import requests

# JSON endpoint copied from the XHR list in the browser's Network tab
res = requests.get("https://stats.nba.com/stats/leagueLeaders?LeagueID=00&PerMode=Totals&Scope=S&Season=2015-16&SeasonType=Regular+Season&StatCategory=FG3M")
data = res.json()

# Each entry in rowSet is one player; index 2 holds the player's name
for row in data['resultSet']['rowSet']:
    print(row[2])
Output:
Stephen Curry
Klay Thompson
James Harden
Damian Lillard
..
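
Since the question asks for the whole table rather than only the names, here is a minimal sketch of how each row could be paired with its column names. It assumes the response also carries a `headers` list next to `rowSet` inside `resultSet`; check that in the Network tab before relying on it. The helper names `rows_to_records` and `fetch_leaders`, the `User-Agent` header, and the timeout are illustrative choices of mine, not part of the answer above.

```python
# Endpoint taken from the answer above; the response layout is an assumption.
URL = (
    "https://stats.nba.com/stats/leagueLeaders"
    "?LeagueID=00&PerMode=Totals&Scope=S&Season=2015-16"
    "&SeasonType=Regular+Season&StatCategory=FG3M"
)


def rows_to_records(payload):
    """Pair every row in resultSet.rowSet with the names in resultSet.headers."""
    result = payload["resultSet"]
    headers = result["headers"]  # assumed key holding the column names
    return [dict(zip(headers, row)) for row in result["rowSet"]]


def fetch_leaders(url=URL, timeout=10):
    """Download the JSON payload and return one dict per player."""
    # Imported here so rows_to_records can be used without requests installed.
    import requests

    # Assumption: a browser-like User-Agent may be needed; adjust or drop it.
    res = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=timeout)
    res.raise_for_status()
    return rows_to_records(res.json())


if __name__ == "__main__":
    for record in fetch_leaders():
        print(record)
```

Each record is a plain dict keyed by column name, so it can be written out with `csv.DictWriter` or passed to `pandas.DataFrame` if you want the table in one piece.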