
Scrapy: how to crawl the URL I got from spider? exceptions.NameError: global name 'parse_detail' is not defined

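I am practicing with Scrapy and have a question: I want to crawl the links that my spider extracted, but I don't know how to do that.

Here is my code. As you can see, the links I scraped are stored in the variable movie_descriptionTW_URL.
I wrote yield Request(movie_descriptionTW, parse_detail) to send each response to this method: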

def parse_detail(self, response):
    print(response.url)

But I get an error: exceptions.NameError: global name 'parse_detail' is not defined
How can I solve this?
Please teach me! Thank you.

from scrapy.spider import Spider
from scrapy.selector import Selector
from yahoo.items import YahooItem
from scrapy.http.request import Request   

class MySpider(Spider):   
    name = "yahoogo"
    start_urls = ["https://tw.movies.yahoo.com/chart.html"]  

    def parse(self, response):
        sel = Selector(response)
        sites = sel.xpath("//tr")
        items = []
        for site in sites:
            item = YahooItem()
            ranking_list = site.xpath("td[@class='c1']/span/text()").extract()
            movie_descriptionTW  = site.xpath("(td[@class='c3']/*//a)[position() < last()-1]/text() | td[@class='c3']/a[1]/text() ").extract()
            movie_descriptionTW_URL = site.xpath("(td[@class='c3']/*//a[2]/@href) | td[@class='c3']/a[1]/@href ").extract()   

            # crawl again!
            yield Request(movie_descriptionTW, parse_detail)

            if ranking_list:    
                items.append(item)
        yield items     

    def parse_detail(self, response):
        print(response.url)

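parse_detail is a method of the spider, so reference it through the instance as self.parse_detail. Also note that extract() returns a list, while Request expects a single URL string, so yield one Request per URL in movie_descriptionTW_URL, like this: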

for url in movie_descriptionTW_URL:
    yield Request(url=url, callback=self.parse_detail)
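
To see why the bare name fails, here is a minimal sketch that runs without Scrapy. FakeRequest, BrokenSpider, FixedSpider and the example.com URLs are made up for illustration and are not part of Scrapy or of the original code; only the name lookup behaviour is the point. Because parse is a generator, the NameError appears only once the generator is iterated, which matches when Scrapy would report it.

```python
class FakeRequest:
    """Hypothetical stand-in for scrapy's Request; it only stores its arguments."""

    def __init__(self, url, callback):
        self.url = url
        self.callback = callback


class BrokenSpider:
    def parse(self, urls):
        for url in urls:
            # Bare name: Python searches local, enclosing, global and builtin
            # scopes, but never the class body, so this raises NameError.
            yield FakeRequest(url, parse_detail)

    def parse_detail(self, response):
        return response


class FixedSpider:
    def parse(self, urls):
        for url in urls:
            # Attribute lookup on the instance finds the method and binds self.
            yield FakeRequest(url, self.parse_detail)

    def parse_detail(self, response):
        return response


urls = ["https://example.com/a", "https://example.com/b"]  # made-up URLs

try:
    list(BrokenSpider().parse(urls))  # iterating the generator triggers the error
    error = None
except NameError as exc:
    error = exc

fixed_requests = list(FixedSpider().parse(urls))
```

In the fixed version each request carries a bound method, so whatever calls the callback later only has to pass the response.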

