Scrapy spider only scrapes 2 pages and does not go to the next page

Question

When I run this code, the spider scrapes only 2 pages and then stops. It does not go on to the next page.

# -*- coding: utf-8 -*-
import scrapy


class P1Spider(scrapy.Spider):
    name = 'p1'
    allowed_domains = ['www.visit.ferienmesse.ch']
    start_urls = ['https://www.visit.ferienmesse.ch/de/aussteller']

    def parse(self, response):

        for data in response.xpath('//ul[@class="ngn-search-list ngn-mobile-filter"]/li'):
            yield {
                'Link': response.urljoin(data.xpath('.//h2[@class="ngn-content-box-title"]/a/@href').get()),
                'Title': data.xpath('.//h2[@class="ngn-content-box-title"]/a/bdi/text()').get(),
                'Address': data.xpath('.//span[@class="ngn-hallname"]/text()').get(),
                'Code': data.xpath('.//span[@class="ngn-stand"]/text()').get()
            }

        next_page = response.xpath('//li[@class="arrow "]/a/@href').get()

        if next_page:
            yield scrapy.Request(url=response.urljoin(next_page), callback=self.parse)

Answer 1

Change the next-page selector to the following and see whether it works:

next_page = response.css('.pagination li.arrow a[rel="next"]::attr(href)').get()

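Reason

From the second page onward, there are two li elements with the class arrow. The original XPath matches both, and .get() returns only the first match, which is presumably the link back to the previous page. That URL has already been requested, so Scrapy's duplicate filter drops it and the crawl ends.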

You can read more about selectors here: https://docs.scrapy.org/en/latest/topics/selectors.html
