I cannot pass the links on to the next callback. When I start the spider, I get no data. Please help with the code.
I'm a beginner in Scrapy.
import scrapy
from movie.items import AfishaCinema


class AfishaCinemaSpider(scrapy.Spider):
    name = 'afisha-cinema'
    allowed_domains = ['kinopoisk.ru']
    start_urls = ['https://www.kinopoisk.ru/premiere/ru/']

    def parse(self, response):
        links = response.css('div.textBlock>span.name_big>a').xpath(
            '@href').extract()
        for link in links:
            yield scrapy.Request(link, callback=self.parse_moov,
                                 dont_filter=True)

    def parse_moov(self, response):
        item = AfishaCinema()
        item['name'] = response.css('h1.moviename-big::text').extract()
The reason you are not getting any data is that you don't yield any items from your parse_moov method. As per the documentation, the parse method must return an iterable of Request objects and/or dicts or Item objects, and the same applies to any other callback such as parse_moov. So add
yield item
at the end of your parse_moov method.
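To see why the missing yield matters, here is a minimal plain-Python sketch that does not need Scrapy at all. The function names and the title argument are hypothetical stand-ins for parse_moov and the response, and a plain dict stands in for the AfishaCinema item.

```python
def parse_moov_without_yield(title):
    # The item is built but never handed back, so the call returns None.
    item = {'name': title}


def parse_moov_with_yield(title):
    # Yielding turns the function into a generator, so the caller
    # receives an iterable that contains the item.
    item = {'name': title}
    yield item


nothing = parse_moov_without_yield('Some movie')
items = list(parse_moov_with_yield('Some movie'))
print(nothing)
print(items)
```

The first call gives the caller nothing to collect, which matches the empty output described in the question; the second hands back one item.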
Also, to be able to run your code, I had to modify
yield scrapy.Request(link, callback=self.parse_moov, dont_filter=True)
to
yield scrapy.Request(response.urljoin(link), callback=self.parse_moov, dont_filter=True)
in the parse method; otherwise I was getting errors:
ValueError: Missing scheme in request url: /film/monstry-na-kanikulakh-3-more-zovyot-2018-950968/
(That's because the Request constructor needs an absolute URL, while the page contains relative URLs.)
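As a rough illustration of what resolving a relative link means, the sketch below uses the standard library's urljoin, on the assumption that response.urljoin behaves in essentially the same way with the response's own URL as the base. The variable names are only for illustration; the base URL and the path are the ones from the spider and the error message above.

```python
from urllib.parse import urljoin

# URL of the page the spider started from (taken from start_urls).
base_url = 'https://www.kinopoisk.ru/premiere/ru/'

# Relative link as extracted from the page (taken from the error message).
relative_link = '/film/monstry-na-kanikulakh-3-more-zovyot-2018-950968/'

# Resolve the relative link against the page URL to get an absolute URL.
absolute_url = urljoin(base_url, relative_link)
print(absolute_url)
```

The resulting URL carries the scheme and host, so it can be passed to scrapy.Request. In recent Scrapy versions, response.follow(link, callback=self.parse_moov) should also accept the relative link directly.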