Scrapy shell - 'fetch' is not defined
Trying to use fetch in a Scrapy shell opened during a crawl with:
from scrapy.shell import inspect_response
inspect_response(response, self)
>>> fetch
Traceback (most recent call last):
  File "<console>", line 1, in <module>
NameError: name 'fetch' is not defined
shelp()
[s] Available Scrapy objects:
[s] scrapy scrapy module (contains scrapy.Request, scrapy.Selector, etc)
[s] crawler <scrapy.crawler.Crawler object at 0x10b23ecd0>
[s] item {}
[s] request <GET https://inventory.dealersocket.com/admin/inventory/current>
[s] response <200 https://inventory.dealersocket.com/admin/inventory/current>
[s] settings <scrapy.settings.Settings object at 0x10b23ec50>
[s] Useful shortcuts:
[s] shelp() Shell help (print this help)
[s] view(response) View response in a browser
As you can see, there is no fetch command available.
Question: how can I make a request from this Scrapy shell?
fetch is only available through the scrapy shell command. It is not available while crawling, because the Scrapy engine is already busy running the spider, so fetch does not fit in there. However, you can somewhat hack around this by scheduling a high-priority request with a temporary callback:
from scrapy import Request, Spider
from scrapy.shell import inspect_response

class MySpider(Spider):
    name = 'myspider'

    def parse(self, response):
        inspect_response(response, self)

    def _fetch_parse(self, response):
        # re-open the shell on the freshly fetched response
        inspect_response(response, self)

    def fetch(self, url):
        # schedule a high-priority request directly on the engine
        self.crawler.engine.schedule(
            Request(url, self._fetch_parse, priority=1000), self)
When the shell prompt opens from parse, you can run:
crawler.spider.fetch('http://stackoverflow')
# ctrl+d to exit and wait for fetch to trigger.
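The trick relies on the scheduler dispatching higher-priority requests before the ordinary crawl queue. A minimal sketch of that behavior (a toy priority queue, not Scrapy's actual scheduler; `ToyScheduler` and the URLs are made up for illustration):

```python
import heapq

class ToyScheduler:
    """Toy stand-in for a priority scheduler: pops the highest-priority
    request first, FIFO among requests with equal priority."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # insertion order, used as a tie-breaker

    def schedule(self, url, priority=0):
        # heapq is a min-heap, so store the negated priority
        heapq.heappush(self._heap, (-priority, self._counter, url))
        self._counter += 1

    def next_request(self):
        return heapq.heappop(self._heap)[2]

sched = ToyScheduler()
sched.schedule('https://example.com/page1')         # normal crawl requests
sched.schedule('https://example.com/page2')
sched.schedule('https://example.com/debug', 1000)   # our fetch() request
print(sched.next_request())  # the priority=1000 request comes out first
```

This is why `priority=1000` in the answer above makes the fetched URL jump ahead of whatever the spider had already queued.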