Scrapy throwing up Traceback when trying to parse tabulated data
I am running Python.org version 2.7 64-bit on Windows Vista 64-bit. I have some Scrapy code that tries to parse the data in a table at the URL given in the following code:
from scrapy.spider import Spider
from scrapy.selector import Selector
from scrapy.utils.markup import remove_tags
from scrapy.cmdline import execute
import re

class MySpider(Spider):
    name = "wiki"
    allowed_domains = ["whoscored.com"]
    start_urls = ["http://www.whoscored.com/Players/3859/Fixtures/Wayne-Rooney"]

def parse(self, response):
    for row in response.selector.xpath('//table[@id="player-fixture"]//tr[td[@class="tournament"]]'):
        # Does this row contain goal symbols?
        list_of_goals = row.xpath('//span[@title="Goal"')
        if list_of_goals:
            print remove_tags(list_of_goals).encode('utf-8')

execute(['scrapy','crawl','wiki'])
However, it raises the following error:
Traceback (most recent call last):
  File "c:\Python27\lib\site-packages\twisted\internet\base.py", line 1201, in mainLoop
    self.runUntilCurrent()
  File "c:\Python27\lib\site-packages\twisted\internet\base.py", line 824, in runUntilCurrent
    call.func(*call.args, **call.kw)
  File "c:\Python27\lib\site-packages\twisted\internet\defer.py", line 383, in callback
    self._startRunCallbacks(result)
  File "c:\Python27\lib\site-packages\twisted\internet\defer.py", line 491, in _startRunCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File "c:\Python27\lib\site-packages\twisted\internet\defer.py", line 578, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "c:\Python27\lib\site-packages\scrapy\spider.py", line 56, in parse
    raise NotImplementedError
exceptions.NotImplementedError:
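Can anyone tell me what the problem is here? I am trying to print every item in the table to the screen, including the data in the goals and assists columns.
Thanks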
Your indentation is wrong:
class MySpider(Spider):
    name = "wiki"
    allowed_domains = ["whoscored.com"]
    start_urls = ["http://www.whoscored.com/Players/3859/Fixtures/Wayne-Rooney"]

    def parse(self, response):
        for row in response.selector.xpath('//table[@id="player-fixture"]//tr[td[@class="tournament"]]'):
            # Does this row contain goal symbols?
            list_of_goals = row.xpath('//span[@title="Goal"')
            if list_of_goals:
                print remove_tags(list_of_goals).encode('utf-8')
When you use the Spider class, you must implement the parse method. This is what that method looks like in the source code:
def parse(self, response):
    raise NotImplementedError
Because your indentation is wrong, parse is not part of the class, so you have not implemented the required method.
The raise NotImplementedError is there to make sure you write the required parse method when you inherit from the Spider base class.
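The effect of the misplaced indentation can be reproduced without Scrapy. The sketch below is only an illustration: BaseSpider, BrokenSpider, FixedSpider and try_parse are hypothetical names that are not part of Scrapy, and BaseSpider is a simplified stand-in for the real Spider class.

```python
class BaseSpider(object):
    """Simplified stand-in for Scrapy's Spider base class (hypothetical)."""

    def parse(self, response):
        # Subclasses are expected to override this method.
        raise NotImplementedError


class BrokenSpider(BaseSpider):
    name = "broken"


# Dedented by mistake: this is a module-level function, not a method,
# so BrokenSpider still inherits BaseSpider.parse.
def parse(self, response):
    return "parsed %s" % response


class FixedSpider(BaseSpider):
    name = "fixed"

    # Indented under the class, so it overrides BaseSpider.parse.
    def parse(self, response):
        return "parsed %s" % response


def try_parse(spider, response):
    """Return the parse result, or the name of the exception raised."""
    try:
        return spider.parse(response)
    except NotImplementedError as exc:
        return type(exc).__name__
```

Calling try_parse(BrokenSpider(), "page") reports NotImplementedError, while the same call with FixedSpider() returns the parsed value, which mirrors the difference between the question's code and the corrected version.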
Now you just need to find the correct XPath ;)
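Two things about the expression '//span[@title="Goal"' are worth noting. It is missing its closing bracket, and in Scrapy selectors a path that starts with // is evaluated against the whole document even when it is called on a row, whereas .// stays inside the current row. The sketch below illustrates the per-row search using the standard library's xml.etree.ElementTree instead of Scrapy. The markup is a made-up, simplified fragment, not the real whoscored.com page, and goals_per_row is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; NOT the real whoscored.com page.
HTML = """
<table id="player-fixture">
  <tr><td class="tournament">EPL</td><td><span title="Goal"/><span title="Goal"/></td></tr>
  <tr><td class="tournament">EPL</td><td><span title="Assist"/></td></tr>
  <tr><td class="tournament">UCL</td><td><span title="Goal"/></td></tr>
</table>
"""


def goals_per_row(table):
    """Count the Goal spans inside each tournament row."""
    counts = []
    for row in table.findall("./tr"):
        # Skip rows that have no tournament cell.
        if row.find("./td[@class='tournament']") is None:
            continue
        # The leading "." keeps the search relative to this row.
        goals = row.findall(".//span[@title='Goal']")
        counts.append(len(goals))
    return counts


table = ET.fromstring(HTML.strip())
print(goals_per_row(table))  # -> [2, 0, 1]
```

In the spider itself, the corresponding call would presumably be row.xpath('.//span[@title="Goal"]'), assuming the page really marks goals with that title attribute.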