How to extract data for next page using Scrapy

So I have written a script which has 2 functions:


  • parse(): extracts URLs from the main URL and sends them to parse_city() to extract each URL's details. Once this is done, parse() extracts the next page and calls itself to repeat the above step.
  • parse_city(): extracts the details from each URL.

The first page is extracted fine with this logic, but the URLs from the following pages never seem to reach parse_city().

Here is the dummy code:

import scrapy
from bs4 import BeautifulSoup as bs
from scrapy import Request
from scrapy.spiders import CrawlSpider, Rule
from scrapy.linkextractors import LinkExtractor

class TjSpider(scrapy.Spider):
    global count
    count = 0
    name = 'TJ'
    allowed_domains = ['example.com']
    start_urls = ['http://example.com/xyz']
    def parse(self, response):
        urls = response.xpath('//h2/a/@href').getall()
        for url in urls:
            base_url = "http://example.com"
            yield Request(base_url+url, callback=self.parse_city)
            next_page = response.xpath('//div[@class="fr"]/em[@class="active"]/following-sibling::em[1]/a/@href').extract()
            if next_p:
                yield Request(next_p,callback=self.parse)
        except Exception as e:
            print("Pages over") 

def parse_city(self, response):
    global count
    title = response.xpath("<xpath to title>").extract()
    yield {
       'title' = title

It also prints the URLs extracted for each page, but it doesn't go into parse_city() for the next pages. I am new to Scrapy and don't understand what's going wrong.


You have a syntax error in parse_city: a dict literal takes a colon, not =, between key and value:

yield {
   'title': title
}

UPDATE: A lot of your requests are being filtered as offsite. You have:

allowed_domains = ['example.com']

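but you are trying to get next_page from abc.com, so Scrapy drops those requests before they are downloaded. Add that domain to allowed_domains (or remove the attribute) and the next pages should be crawled.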
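
To see why those requests are dropped, here is a rough standard-library sketch of the kind of host check involved. is_offsite is a hypothetical helper written for illustration, not Scrapy's own implementation, and the URL paths are made up; the idea is only that a host passes when it equals an allowed domain or is a subdomain of one.

```python
from urllib.parse import urlparse


def is_offsite(url, allowed_domains):
    """Return True if url's host is outside every allowed domain.

    Hypothetical helper that approximates the host check; it is not
    Scrapy's own implementation.
    """
    host = (urlparse(url).hostname or "").lower()
    return not any(
        host == domain or host.endswith("." + domain)
        for domain in allowed_domains
    )


# With the spider's current setting, the next-page link is rejected.
print(is_offsite("http://abc.com/xyz?page=2", ["example.com"]))  # True
print(is_offsite("http://example.com/city/1", ["example.com"]))  # False

# Listing both domains lets the next-page link through.
print(is_offsite("http://abc.com/xyz?page=2", ["example.com", "abc.com"]))  # False
```

In the spider itself, the equivalent change would be allowed_domains = ['example.com', 'abc.com'].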

