Scrapy: how to pass links

Question

I can't get my spider to follow links. When I start the spider, I don't get any data. Please help with the code.

I'm a beginner with Scrapy.

import scrapy
from movie.items import AfishaCinema

class AfishaCinemaSpider(scrapy.Spider):
    name = 'afisha-cinema'
    allowed_domains = ['kinopoisk.ru']
    start_urls = ['https://www.kinopoisk.ru/premiere/ru/']

    def parse(self, response):
        links = response.css('div.textBlock>span.name_big>a').xpath(
            '@href').extract()
        for link in links:
            yield scrapy.Request(link, callback=self.parse_moov,
                                 dont_filter=True)

    def parse_moov(self, response):
        item = AfishaCinema()
        item['name'] = response.css('h1.moviename-big::text').extract()

Answer 1

The reason you are not getting any data is that your parse_moov method doesn't yield anything. As per the documentation, a callback such as parse must return an iterable of Request objects and/or dicts or Item objects. So add

yield item

at the end of your parse_moov method.
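
To see why the missing yield matters, here is a minimal sketch in plain Python with no Scrapy involved. The function names are hypothetical and a plain dict stands in for the AfishaCinema item; it only illustrates the difference between a callback that builds an item and one that actually hands it back.

```python
# Plain-Python illustration; a dict stands in for the AfishaCinema item
# and both function names are made up for this example.

def parse_without_yield(title):
    # Builds the item but never hands it back, so the call returns None.
    item = {}
    item['name'] = title


def parse_with_yield(title):
    # Same body plus `yield item`: calling it returns a generator that
    # the caller (Scrapy, in the real spider) can iterate over.
    item = {}
    item['name'] = title
    yield item


print(parse_without_yield('Example'))     # None
print(list(parse_with_yield('Example')))  # [{'name': 'Example'}]
```

In the real spider, Scrapy plays the role of the caller: it iterates over whatever the callback returns, so a callback that returns None gives it nothing to collect.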

Also, to be able to run your code, I had to modify

yield scrapy.Request(link, callback=self.parse_moov, dont_filter=True)

to

yield scrapy.Request(response.urljoin(link), callback=self.parse_moov, dont_filter=True)

in the parse method, otherwise I was getting errors:

ValueError: Missing scheme in request url: /film/monstry-na-kanikulakh-3-more-zovyot-2018-950968/

(That's because the Request constructor needs an absolute URL, while the page contains relative URLs.)
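
The effect of resolving a relative link can be checked outside of a spider. The sketch below uses urljoin from the standard library rather than Scrapy itself; as far as I know, response.urljoin performs the same kind of resolution against the URL of the page being parsed. The variable names are only for illustration.

```python
from urllib.parse import urljoin

# URL of the page being parsed (the spider's start URL).
page_url = 'https://www.kinopoisk.ru/premiere/ru/'

# Relative link as it appears in the page's href attribute.
relative_link = '/film/monstry-na-kanikulakh-3-more-zovyot-2018-950968/'

# Resolve the relative link against the page URL to get an absolute URL
# that includes the scheme and host.
absolute_link = urljoin(page_url, relative_link)
print(absolute_link)
# https://www.kinopoisk.ru/film/monstry-na-kanikulakh-3-more-zovyot-2018-950968/
```

Because the link starts with a slash, it replaces the whole path of the page URL while keeping its scheme and host.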

solution1
2 2018-07-14 14:01:09