Scrapy: How to scrape through multiple webpages selected from dropdown

How would I scrape multiple pages of one website with Scrapy when the pages are reached through a dropdown and a button? I know how to handle pagination by finding the link to the next page in the current page, but that technique does not apply here.

One idea I've had is to find the value of the next dropdown option and change the URL to point to the corresponding page. Would this be a valid solution?

Here's an example: http://highschoolsports.nj.com/football/standings/?grouping=15764

First, visit one of the team standings pages, such as GMC - Blue, and collect every value attribute from the dropdown's option elements.

 <option value="">Select a Conference - Division</option>
 <option value="15764" selected="selected">GMC - Blue</option>
 <option value="15767">GMC - Red</option>
 <option value="15713">GMC - White</option>
 <option value="18380">Independent</option>
 <option value="15773">Mid-State 38 - Delaware</option>
 <option value="15854">Mid-State 38 - Mountain</option>
 <option value="15824">Mid-State 38 - Raritan</option>
  ....

Then create a for loop and change the value of grouping in each request, for example: http://highschoolsports.nj.com/football/standings/?grouping=18380
