I started Scrapy with the official tutorial, but I can't get it to work. My code is identical to the official example:
import scrapy

class QuotesSpider(scrapy.Spider):
    name = 'Quotes';

    def start_requests(self):
        urls = [
            'http://quotes.toscrape.com/page/1/',
        ]
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse);

    def parse(self, response):
        page = response.url.split('/')[-2];
        print('--------------------------------->>>>');
        for quote in response.css('div.quote'):
            yield {
                'text': quote.css('span.text::text').get(),
                'author': quote.css('small.author::text').get(),
                'tags': quote.css('div.tags a.tag::text').getall(),
            }
When I execute it in CMD with the command scrapy crawl Quotes, the result is:
2020-12-20 10:00:25 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://quotes.toscrape.com/page/1/> (referer: None)
2020-12-20 10:00:26 [scrapy.core.scraper] ERROR: Spider error processing <GET http://quotes.toscrape.com/page/1/> (referer: None)
Traceback (most recent call last):
File "c:\users\a\appdata\local\programs\python\python38-32\lib\site-packages\twisted\internet\defer.py", line 1418, in _inlineCallbacks
result = g.send(result)
StopIteration: <200 http://quotes.toscrape.com/page/1/>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "c:\users\a\appdata\local\programs\python\python38-32\lib\site-packages\scrapy\utils\defer.py", line 55, in mustbe_deferred
result = f(*args, **kw)
File "c:\users\a\appdata\local\programs\python\python38-32\lib\site-packages\scrapy\core\spidermw.py", line 58, in process_spider_input
return scrape_func(response, request, spider)
File "c:\users\a\appdata\local\programs\python\python38-32\lib\site-packages\scrapy\core\scraper.py", line 149, in call_spider
warn_on_generator_with_return_value(spider, callback)
File "c:\users\a\appdata\local\programs\python\python38-32\lib\site-packages\scrapy\utils\misc.py", line 245, in warn_on_generator_with_return_value
if is_generator_with_return_value(callable):
File "c:\users\a\appdata\local\programs\python\python38-32\lib\site-packages\scrapy\utils\misc.py", line 230, in is_generator_with_return_value
tree = ast.parse(dedent(inspect.getsource(callable)))
File "c:\users\a\appdata\local\programs\python\python38-32\lib\ast.py", line 47, in parse
return compile(source, filename, mode, flags,
File "<unknown>", line 1
def parse(self, response):
^
IndentationError: unexpected indent
2020-12-20 10:00:26 [scrapy.core.engine] INFO: Closing spider (finished)
2020-12-20 10:00:26 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
I have checked it many times, but I still do not know how to deal with it!
There is an IndentationError. You need to fix the code indentation; after that it works fine.
You might find a solution for your issue here
It is not about the yield. I think either the semicolons or the trailing comma after getall(),
'tags': quote.css('div.tags a.tag::text').getall(),
might cause the interpreter to expect something else. Remove the semicolons and the trailing comma; does it still not work?
The error output shows the indentation error at:
def parse
^
This tells you that something before that point caused it, so I guess it is the first semicolon.
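The traceback itself points at the mechanism: Scrapy calls ast.parse(dedent(inspect.getsource(callable))) on the parse callback. inspect.getsource() returns the method with its class-body indentation still attached, and textwrap.dedent() is supposed to strip it; dedent() treats tabs and spaces as distinct characters, though, so if the spider file mixes them nothing gets stripped and ast.parse() raises exactly this "unexpected indent" error. A minimal demonstration (the source strings below are illustrative stand-ins for what inspect.getsource() would return):

```python
import ast
from textwrap import dedent

# What inspect.getsource() returns for a method: class-body indent included.
src = "    def parse(self, response):\n        yield response\n"

# Parsing that directly raises the same error as in the traceback.
try:
    ast.parse(src)
except IndentationError as exc:
    print("IndentationError:", exc)

# Scrapy dedents first, which normally makes it parse cleanly.
ast.parse(dedent(src))

# But mix a tab with spaces and dedent() finds no common prefix,
# strips nothing, and ast.parse() fails again.
src_mixed = "\tdef parse(self, response):\n        yield response\n"
print(dedent(src_mixed) == src_mixed)  # unchanged by dedent
```

So the fix on the asker's side is to re-indent the file with spaces only; the semicolons and the trailing comma are ugly but legal Python and are not what triggers this error.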