
Python Scrapy -> Use a scrapy spider as a function

So I have the following Scrapy spider in spiders.py:

import scrapy


class TwitchSpider(scrapy.Spider):
    name = "clips"

    def start_requests(self):
        urls = [
            'https://www.twitch.tv/wilbursoot/clips?filter=clips&range=7d'
        ]
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        for clip in response.css('.tw-tower'):
            yield {
                'title': clip.css('::text').get()
            }

But the key point is that I want to call this spider as a function from another file, instead of running scrapy crawl clips in the console. Where can I read more about this, or is it possible at all? I checked the Scrapy documentation but didn't find much.

I'm a fairly beginner-level developer, but you could try importing the spider and working with it directly.

Put your other file in the same directory as your spider file, then import the spider module:

import spiders

You will then have access to the module and can create a spider object:

spi = spiders.TwitchSpider()

You can then call methods on that object, such as

spi.parse(response)

This article shows how to import classes and functions from other Python files: https://csatlas.com/python-import-file-module/
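Keep in mind that parse() expects a Scrapy Response object, so if you call it directly you have to build that response yourself. Here is a minimal sketch, assuming the requests library is installed and that spiders.py is importable from the current directory (whether the selector actually matches depends on the raw HTML containing that markup):

import requests
from scrapy.http import HtmlResponse

from spiders import TwitchSpider

# Download the page ourselves, wrap it in a Scrapy response,
# and feed that response to parse() by hand.
url = 'https://www.twitch.tv/wilbursoot/clips?filter=clips&range=7d'
body = requests.get(url).content
response = HtmlResponse(url=url, body=body, encoding='utf-8')

spider = TwitchSpider()
for item in spider.parse(response):
    print(item)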

Run the spider from main.py:

from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

if __name__ == "__main__":
    spider = 'clips'  # the spider's name attribute (or pass the TwitchSpider class itself)
    settings = get_project_settings()
    # change/update settings:
    settings['USER_AGENT'] = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'
    process = CrawlerProcess(settings)
    process.crawl(spider)
    process.start()

See the Scrapy docs: Run Scrapy from a script (https://docs.scrapy.org/en/latest/topics/practices.html#run-scrapy-from-a-script).
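If you want the call to behave like a function that returns the scraped items, one option is to collect them with the item_scraped signal. A minimal sketch, assuming the project settings are available and the spider's name is 'clips' (run_clips_spider is a name made up here); note that process.start() blocks and a CrawlerProcess can only be started once per Python process:

from scrapy import signals
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings


def run_clips_spider():
    """Run the 'clips' spider and return the scraped items as a list."""
    items = []

    def collect_item(item, response, spider):
        items.append(item)

    process = CrawlerProcess(get_project_settings())
    crawler = process.create_crawler('clips')  # looks the spider up by its name
    crawler.signals.connect(collect_item, signal=signals.item_scraped)
    process.crawl(crawler)
    process.start()  # blocks until the crawl finishes
    return items


if __name__ == "__main__":
    for item in run_clips_spider():
        print(item)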
