Using a Python Scrapy-based crawler but getting an error
Hi everyone, I wrote a crawler in Python with Scrapy to scrape…
import scrapy
from c2.items import C2Item

try:
    class C2(scrapy.Spider):
        name = 'cn'
        allowed_domains = ['priceraja.com']
        start_urls = ['https://www.priceraja.com']

        def parse_item(self, response):
            Item = []
            Item['url'] = response.xpath('//a/@href/text()').extract()
            yield Item
except Exception:
    logging.exception("message")
I keep getting a NotImplementedError:
2017-08-05 01:12:28 [scrapy.core.scraper] ERROR: Spider error processing
<GET https://www.killerfeatures.com> (referer: None)
Traceback (most recent call last):
  File "D:\Ana\lib\site-packages\twisted\internet\defer.py", line 653, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "D:\Ana\lib\site-packages\scrapy\spiders\__init__.py", line 90, in parse
    raise NotImplementedError
NotImplementedError
2017-08-05 01:12:28 [scrapy.core.engine] INFO: Closing spider (finished)
2017-08-05 01:12:28 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 435,
'downloader/request_count': 2,
'downloader/request_method_count/GET': 2,
 'downloader/response_bytes': 9282,
 'downloader/response_count': 2,
 'downloader/response_status_count/200': 1,
 'downloader/response_status_count/301': 1,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2017, 8, 4, 19, 42, 28, 837000),
 'log_count/DEBUG': 3,
 'log_count/ERROR': 1,
 'log_count/INFO': 7,
 'response_received_count': 1,
 'scheduler/dequeued': 2,
 'scheduler/dequeued/memory': 2,
 'scheduler/enqueued': 2,
 'scheduler/enqueued/memory': 2,
 'spider_exceptions/NotImplementedError': 1,
 'start_time': datetime.datetime(2017, 8, 4, 19, 42, 25, 976000)}
2017-08-05 01:12:28 [scrapy.core.engine] INFO: Spider closed (finished)
You implemented a parse_item function, but Scrapy is looking for a parse function. Renaming parse_item to parse should work; in other words, you need to override the base class's parse method.
Another solution here is to use CrawlSpider.