Scrapy does not call the Request() callback

Question

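I'm trying to write a recursive parsing script with Scrapy, but Request() never calls the callback function suppose_to_parse(), nor any other function I pass as the callback value. I've tried several variations, but none of them work. Where should I look?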

from scrapy.http import Request
from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector



class joomler(BaseSpider):
    name = "scrapy"
    allowed_domains = ["scrapy.org"]
    start_urls = ["http://blog.scrapy.org/"]


    def parse(self, response):
        print "Working... "+response.url
        hxs = HtmlXPathSelector(response)
        for link in hxs.select('//a/@href').extract():
            if not link.startswith('http://') and not link.startswith('#'):
               url=""
               url=(self.start_urls[0]+link).replace('//','/')
               print url
               yield Request(url, callback=self.suppose_to_parse)


    def suppose_to_parse(self, response):
        print "asdasd"
        print response.url

Answer 1

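I'm no expert, but I tried your code and I don't think the problem is with Request itself: the generated URLs appear to be broken. If you put some valid URLs in a list, iterate over them, and yield a Request with a callback for each one, it works fine.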
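
To see how the URLs get broken, here is a minimal sketch that repeats the string manipulation from the question's parse() method outside Scrapy. The `link` value is a hypothetical relative href chosen for illustration, not one taken from the actual page.

```python
# Reproduce the URL construction from the question, outside Scrapy.
start_url = "http://blog.scrapy.org/"
link = "/about/"  # hypothetical relative href, for illustration only

# Concatenation produces "...org//about/", and replace() collapses that
# double slash. It also collapses the "//" that follows the scheme.
broken = (start_url + link).replace('//', '/')

print(broken)  # http:/blog.scrapy.org/about/
```

A URL of this shape no longer has a valid host part, so the request is likely to be filtered out or to fail before the callback ever runs.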

Answer 2

Move the yield outside the if statement:

for link in hxs.select('//a/@href').extract():
    url = link
    if not link.startswith('http://') and not link.startswith('#'):
        url = (self.start_urls[0] + link).replace('//','/')

    print url
    yield Request(url, callback=self.suppose_to_parse)
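
As an alternative to building URLs by hand, relative links can be resolved with `urljoin` from the Python standard library. The sketch below is a swapped-in technique, not the method used in this answer; the helper name `resolve_links` and the sample hrefs are hypothetical. Unlike the snippet above, it skips fragment-only links rather than passing them on. It is written for Python 3; on Python 2 the import would be `from urlparse import urljoin`.

```python
from urllib.parse import urljoin


def resolve_links(base_url, hrefs):
    """Yield an absolute URL for each href, skipping fragment-only links."""
    for href in hrefs:
        if href.startswith('#'):
            continue  # in-page anchor, nothing to fetch
        # urljoin leaves absolute URLs unchanged and resolves relative ones.
        yield urljoin(base_url, href)


# Hypothetical hrefs, for illustration only.
sample_hrefs = ["/about/", "#top", "http://scrapy.org/doc/", "2013/post.html"]
links = list(resolve_links("http://blog.scrapy.org/", sample_hrefs))
print(links)
```

Inside the spider, each resolved URL could then be passed to `Request(url, callback=self.suppose_to_parse)`, with `response.url` as the base.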

Answer 1
1 2013-03-22 21:56:17

Answer 2
1 accepted 2013-03-22 22:28:35
