
Scrapy run for loop within nested json objects

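I'm using scrapy to scrape a JSON API and want to loop through the offers, then loop through the outcomes of each one, as shown in the screenshot below. I'm getting the offers fine, but I'm not sure what to write for get() because the entries have no label. Everything I've tried results in the error: 'list' object has no attribute 'get'.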

[Screenshot of the JSON response; image not available]

My code is below:

import scrapy
import json

class DkSpider(scrapy.Spider):
    name = 'dk'
    allowed_domains = ['sportsbook.draftkings.com']
    start_urls = ['https://sportsbook.draftkings.com//sites/US-SB/api/v4/eventgroups/88670846/categories/583/subcategories/4991']

    def parse(self, response):
        items = json.loads(response.body)
        cats = items.get('eventGroup').get('offerCategories')

        for cat in cats:
            groups = str(cat.get('name'))

            if groups == "Player Props":
                subcats = cat.get('offerSubcategoryDescriptors')

                for subcat in subcats:
                    markets = str(subcat.get('name'))

                    if markets == "Points":
                        games = subcat.get('offerSubcategory').get('offers')

                        for game in games:
                            outcomes = game.get('outcomes')

  

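Writing "for game in games[0]:" works for a single unlabeled entry, but offers contains several unlabeled entries. Each one is itself a list, which is why calling get() on it raises the 'list' error, so you need to iterate over all of them to get all the information you want.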
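To make the shape of the problem concrete, here is a minimal sketch on made-up data. The sample values ("Player A", "Player B") are hypothetical; only the keys 'outcomes' and 'participant' come from the code in this thread, and the list-of-lists nesting is assumed from the error message.

```python
# Hypothetical sample mimicking the assumed shape of the 'offers' value:
# a list whose elements are themselves lists of offer dicts.
games = [
    [{"outcomes": [{"participant": "Player A"}, {"participant": "Player A"}]}],
    [{"outcomes": [{"participant": "Player B"}]}],
]

# Each element of games is a list, so calling .get() on it fails.
try:
    games[0].get("outcomes")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Indexing games[0] only covers the first inner list.
first_only = [
    outcome["participant"]
    for offer in games[0]
    for outcome in offer["outcomes"]
]

# Looping over every inner list covers all of them.
everything = [
    outcome["participant"]
    for game in games
    for offer in game
    for outcome in offer["outcomes"]
]
```

Here first_only misses "Player B", while everything picks up the participants from both inner lists.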

A solution using your approach:

import scrapy


class DkSpider(scrapy.Spider):
    name = 'dk'
    allowed_domains = ['sportsbook.draftkings.com']
    start_urls = ['https://sportsbook.draftkings.com//sites/US-SB/api/v4/eventgroups/88670846/categories/583/subcategories/4991']

    def parse(self, response):
        items = response.json()
        cats = items.get('eventGroup').get('offerCategories')

        for cat in cats:
            groups = str(cat.get('name'))

            if groups == "Player Props":
                subcats = cat.get('offerSubcategoryDescriptors')

                for subcat in subcats:
                    markets = str(subcat.get('name'))

                    if markets == "Points":
                        games = subcat.get('offerSubcategory').get('offers')

                        # each entry in offers is itself a list, so loop once more
                        for game in games:
                            for in_game in game:
                                outcomes = in_game.get('outcomes')
                                for outcome in outcomes:
                                    print(outcome['participant'])

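Note, however, that this performs more iterations than it actually needs, so it takes longer to run. Either add a break once the match is found, or do something like this: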

import scrapy
import json


class DkSpider(scrapy.Spider):
    name = 'dk'
    allowed_domains = ['sportsbook.draftkings.com']
    start_urls = ['https://sportsbook.draftkings.com//sites/US-SB/api/v4/eventgroups/88670846/categories/583/subcategories/4991']

    def parse(self, response):
        # from scrapy.shell import inspect_response
        # inspect_response(response, self)
        games = json.loads(response.body)['eventGroup']['offerCategories'][1]['offerSubcategoryDescriptors'][1]['offerSubcategory']['offers']
        for game in games:
            for in_game in game:
                outcomes = in_game.get('outcomes')
                for outcome in outcomes:
                    # Get whatever info you want here
                    print(outcome['participant'])
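The hard-coded [1] indexes above depend on the order of the categories in the response. As a possible middle ground, the sketch below looks each level up by name and stops at the first match, which has the same effect as a for loop with break. The helper names find_by_name and iter_participants are hypothetical, and the sample data ("Game Lines", "Rebounds", the player names) is made up to mirror the nesting used in this thread.

```python
def find_by_name(entries, name):
    """Return the first dict in entries whose 'name' equals name, else None."""
    # next() stops at the first match, like a for loop with break.
    return next((e for e in entries if e.get("name") == name), None)


def iter_participants(data, category="Player Props", market="Points"):
    """Yield each outcome's participant for the given category and market."""
    cats = data.get("eventGroup", {}).get("offerCategories", [])
    cat = find_by_name(cats, category)
    if cat is None:
        return
    subcat = find_by_name(cat.get("offerSubcategoryDescriptors", []), market)
    if subcat is None:
        return
    offers = subcat.get("offerSubcategory", {}).get("offers", [])
    for game in offers:        # each game is a list
        for offer in game:     # each offer is a dict
            for outcome in offer.get("outcomes", []):
                yield outcome.get("participant")


# Hypothetical sample with the same nesting as the spider code above.
sample = {
    "eventGroup": {
        "offerCategories": [
            {"name": "Game Lines", "offerSubcategoryDescriptors": []},
            {
                "name": "Player Props",
                "offerSubcategoryDescriptors": [
                    {"name": "Rebounds", "offerSubcategory": {"offers": []}},
                    {
                        "name": "Points",
                        "offerSubcategory": {
                            "offers": [
                                [{"outcomes": [{"participant": "Player A"}]}],
                                [{"outcomes": [{"participant": "Player B"}]}],
                            ]
                        },
                    },
                ],
            },
        ]
    }
}

participants = list(iter_participants(sample))
```

Inside parse() this could be called as iter_participants(response.json()), assuming a Scrapy version that provides response.json(), as the first solution above does.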


Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you republish this post, please credit this site's URL or the original post's address.