Scrapy run for loop within nested json objects
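I'm using Scrapy to scrape a JSON API and want to loop through the offers, then iterate over the outcomes nested inside each offer. I can get the offers fine, but I'm not sure what to pass to get() at the next level, since it has no key. Everything I've tried results in the error: 'list' object has no attribute 'get'.
My code is as follows: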
import scrapy
import json


class DkSpider(scrapy.Spider):
    name = 'dk'
    allowed_domains = ['sportsbook.draftkings.com']
    start_urls = ['https://sportsbook.draftkings.com//sites/US-SB/api/v4/eventgroups/88670846/categories/583/subcategories/4991']

    def parse(self, response):
        items = json.loads(response.body)
        cats = items.get('eventGroup').get('offerCategories')
        for cat in cats:
            groups = str(cat.get('name'))
            if groups == "Player Props":
                subcats = cat.get('offerSubcategoryDescriptors')
                for subcat in subcats:
                    markets = str(subcat.get('name'))
                    if markets == "Points":
                        games = subcat.get('offerSubcategory').get('offers')
                        for game in games:
                            outcomes = game.get('outcomes')
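Each element of games is itself a list, not a dict, which is why calling get() on it raises the 'list' error. Writing for game in games[0]: would work for a single entry, but since there are multiple entries you need to loop through all of them to get all the information you want.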
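To see why the 'list' error occurs, here is a minimal sketch using a hypothetical, trimmed-down stand-in for the offers value. The participant names are made up; only the nesting (a list of lists of dicts) mirrors what the code in this thread assumes about the real response.

```python
# Hypothetical stand-in for the "offers" value; the real API response
# contains many more fields and entries.
games = [
    [  # each entry of "offers" is a list, not a dict
        {"outcomes": [{"participant": "Player A"}, {"participant": "Player B"}]},
    ],
    [
        {"outcomes": [{"participant": "Player C"}]},
    ],
]

# Reproduce the error from the question: a list has no .get() method.
try:
    games[0].get("outcomes")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Iterating one level deeper reaches the dicts that actually hold "outcomes".
participants = [
    outcome["participant"]
    for game in games          # game is a list
    for in_game in game        # in_game is a dict
    for outcome in in_game["outcomes"]
]

print(error_message)
print(participants)
```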
A solution using your approach:
import scrapy


class DkSpider(scrapy.Spider):
    name = 'dk'
    allowed_domains = ['sportsbook.draftkings.com']
    start_urls = ['https://sportsbook.draftkings.com//sites/US-SB/api/v4/eventgroups/88670846/categories/583/subcategories/4991']

    def parse(self, response):
        items = response.json()
        cats = items.get('eventGroup').get('offerCategories')
        for cat in cats:
            groups = str(cat.get('name'))
            if groups == "Player Props":
                subcats = cat.get('offerSubcategoryDescriptors')
                for subcat in subcats:
                    markets = str(subcat.get('name'))
                    if markets == "Points":
                        games = subcat.get('offerSubcategory').get('offers')
                        # Each game is a list, so iterate once more
                        # to reach the dicts that hold 'outcomes'.
                        for game in games:
                            for in_game in game:
                                outcomes = in_game.get('outcomes')
                                for outcome in outcomes:
                                    print(outcome['participant'])
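Note, however, that you're performing more iterations than you actually need, so it will take longer to run. Either add a break once you've found the match, or do something like this: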
import scrapy
import json


class DkSpider(scrapy.Spider):
    name = 'dk'
    allowed_domains = ['sportsbook.draftkings.com']
    start_urls = ['https://sportsbook.draftkings.com//sites/US-SB/api/v4/eventgroups/88670846/categories/583/subcategories/4991']

    def parse(self, response):
        # from scrapy.shell import inspect_response
        # inspect_response(response, self)
        games = json.loads(response.body)['eventGroup']['offerCategories'][1]['offerSubcategoryDescriptors'][1]['offerSubcategory']['offers']
        for game in games:
            for in_game in game:
                outcomes = in_game.get('outcomes')
                for outcome in outcomes:
                    # Get whatever info you want here
                    print(outcome['participant'])
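If the hard-coded [1] indexes feel fragile (the position of a category could differ between responses), one alternative is to look entries up by name and stop at the first match. The sketch below is not from the original answer: find_by_name, iter_outcomes, and the sample payload (including the "Game Lines" and "Rebounds" names and the participant names) are hypothetical, and it assumes the same nesting as the code above.

```python
def find_by_name(entries, name):
    """Return the first dict in entries whose 'name' equals name, or None."""
    return next((entry for entry in entries if entry.get("name") == name), None)


def iter_outcomes(data, category="Player Props", market="Points"):
    """Yield every outcome dict under the named category and subcategory."""
    cats = data.get("eventGroup", {}).get("offerCategories", [])
    cat = find_by_name(cats, category)
    if cat is None:
        return
    subcat = find_by_name(cat.get("offerSubcategoryDescriptors", []), market)
    if subcat is None:
        return
    offers = subcat.get("offerSubcategory", {}).get("offers", [])
    for game in offers:        # each game is a list
        for in_game in game:   # each in_game is a dict
            yield from in_game.get("outcomes", [])


# Hypothetical payload with the same nesting as the API response in the question.
sample = {
    "eventGroup": {
        "offerCategories": [
            {"name": "Game Lines"},
            {
                "name": "Player Props",
                "offerSubcategoryDescriptors": [
                    {"name": "Rebounds"},
                    {
                        "name": "Points",
                        "offerSubcategory": {
                            "offers": [
                                [{"outcomes": [{"participant": "Player A"}]}],
                                [{"outcomes": [{"participant": "Player B"}]}],
                            ]
                        },
                    },
                ],
            },
        ]
    }
}

participants = [outcome["participant"] for outcome in iter_outcomes(sample)]
print(participants)
```

Inside the spider's parse method, this could be used as for outcome in iter_outcomes(response.json()): and each outcome yielded as an item rather than printed.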