How can I run a Scrapy crawl inside a Python project?
I have a personal project that led me to use Selenium to obtain a public URL from a private [mail, password] pair. I want to save the information at this URL, and I followed the Scrapy tutorial to learn how to do that with this tool. But is there a way to launch the crawl from inside a Python project, e.g.

MyScrapClass.crawl()

instead of using the Linux command scrapy crawl MyScrapProject?
Use the CrawlerProcess or CrawlerRunner class to run Scrapy from within a Python script.

http://doc.scrapy.org/en/latest/topics/practices.html

Example taken from the Scrapy website:
import scrapy
from scrapy.crawler import CrawlerProcess

class MySpider(scrapy.Spider):
    # Your spider definition
    ...

process = CrawlerProcess()
process.crawl(MySpider)
# the script will block here until the crawling is finished
process.start()