Passing arguments to a callback function

Question

def parse(self, response):
    for sel in response.xpath('//tbody/tr'):
        item = HeroItem()
        item['hclass'] = response.request.url.split("/")[8].split('-')[-1]
        item['server'] = response.request.url.split('/')[2].split('.')[0]
        item['hardcore'] = len(response.request.url.split("/")[8].split('-')) == 3
        item['seasonal'] = response.request.url.split("/")[6] == 'season'
        item['rank'] = sel.xpath('td[@class="cell-Rank"]/text()').extract()[0].strip()
        item['battle_tag'] = sel.xpath('td[@class="cell-BattleTag"]//a/text()').extract()[1].strip()
        item['grift'] = sel.xpath('td[@class="cell-RiftLevel"]/text()').extract()[0].strip()
        item['time'] = sel.xpath('td[@class="cell-RiftTime"]/text()').extract()[0].strip()
        item['date'] = sel.xpath('td[@class="cell-RiftTime"]/text()').extract()[0].strip()
        url = 'https://' + item['server'] + '.battle.net/' + sel.xpath('td[@class="cell-BattleTag"]//a/@href').extract()[0].strip()

        yield Request(url, callback=self.parse_profile)

def parse_profile(self, response):
    sel = Selector(response)
    item = HeroItem()
    item['weapon'] = sel.xpath('//li[@class="slot-mainHand"]/a[@class="slot-link"]/@href').extract()[0].split('/')[4]
    return item

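OK, so I'm scraping a whole table in the main parse method, and I take several fields from that table. One of those fields is a URL, and I want to follow it to get a whole new set of fields. How can I pass the ITEM object I've already created to the callback function so that the final item keeps all the fields?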

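As the code above shows, I can save either the fields from the URL (the current code) or only the fields from the table (by simply writing yield item), but I can't yield a single object with all the fields together.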

I tried this, but obviously it doesn't work.

yield Request(url, callback=self.parse_profile(item))

def parse_profile(self, response, item):
    sel = Selector(response)
    item['weapon'] = sel.xpath('//li[@class="slot-mainHand"]/a[@class="slot-link"]/@href').extract()[0].split('/')[4]
    return item

Answer 1

This is what the meta keyword is for.

def parse(self, response):
    for sel in response.xpath('//tbody/tr'):
        item = HeroItem()
        # Item assignment here
        url = 'https://' + item['server'] + '.battle.net/' + sel.xpath('td[@class="cell-BattleTag"]//a/@href').extract()[0].strip()

        yield Request(url, callback=self.parse_profile, meta={'hero_item': item})

def parse_profile(self, response):
    item = response.meta.get('hero_item')
    item['weapon'] = response.xpath('//li[@class="slot-mainHand"]/a[@class="slot-link"]/@href').extract()[0].split('/')[4]
    yield item

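Also note that running sel = Selector(response) is a waste of resources and differs from what you did earlier, so I changed it. The selector is automatically available on the response as response.selector, and response.xpath is a convenient shortcut to it.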

Answer 2

Here is a better way to pass args to the callback function:

def parse(self, response):
    request = scrapy.Request('http://www.example.com/index.html',
                             callback=self.parse_page2,
                             cb_kwargs=dict(main_url=response.url))
    request.cb_kwargs['foo'] = 'bar'  # add more arguments for the callback
    yield request

def parse_page2(self, response, main_url, foo):
    yield dict(
        main_url=main_url,
        other_url=response.url,
        foo=foo,
    )

Source: https://docs.scrapy.org/en/latest/topics/request-response.html#topics-request-response-ref-request-callback-arguments

Answer 3

@peduDev

I tried your approach, but it failed because of an unexpected keyword.

scrapy_req = scrapy.Request(url=url, 
callback=self.parseDetailPage,
cb_kwargs=dict(participant_id=nParticipantId))


def parseDetailPage(self, response, participant_id ):
    .. Some code here..
    yield MyParseResult (
        .. some code here ..
        participant_id = participant_id
    )

Error reported
, cb_kwargs=dict(participant_id=nParticipantId)
TypeError: __init__() got an unexpected keyword argument 'cb_kwargs'

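Any idea what, other than an old Scrapy version, could cause the unexpected keyword argument?

Yes. I verified my own suggestion: after upgrading, everything works.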

sudo pip install --upgrade scrapy

Answer 4

I had a similar problem with passing extra arguments in Tkinter and found that this solution works (here: http://infohost.nmt.edu/tcc/help/pubs/tkinter/web/extra-args.html), adapted to your problem:

def parse(self, response):
    item = HeroItem()
    [...]
    # Scrapy calls the callback with the new response as its only positional
    # argument, so only item is bound as a default value here.
    def handler(response, item=item):
        """ passing item as a default argument value """
        return self.parse_profile(response, item)
    yield Request(url, callback=handler)
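
The argument binding that the handler performs can also be sketched with functools.partial from the standard library. This is plain Python only: the dict standing in for a response and the parse_profile function here are hypothetical, and how well a partial (or a nested function) works as a Scrapy callback in every configuration is not something this sketch verifies.

```python
from functools import partial


def parse_profile(response, item):
    # "response" is a hypothetical stand-in: a dict instead of a Scrapy response.
    item['weapon'] = response['weapon']
    return item


item = {'rank': '1'}

# Bind item now; the caller later supplies only the response.
callback = partial(parse_profile, item=item)

result = callback({'weapon': 'some-sword'})
```

Like the default-argument trick, partial fixes the value of item at the moment the callback is created, which matters when callbacks are created inside a loop.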
