How can I run a Scrapy crawl inside a Python project?
I have a personal project that led me to use Selenium to obtain a public URL from a private [mail, password] pair. I want to save the information at this URL, and I followed the Scrapy tutorial to learn how to do that with this tool. But is there a way to launch the crawl from inside a Python project, e.g.

MyScrapClass.crawl()

instead of using the Linux command scrapy crawl MyScrapProject?
Use the CrawlerProcess or CrawlerRunner class to run Scrapy from within a Python script.

http://doc.scrapy.org/en/latest/topics/practices.html

Example taken from the Scrapy website:
import scrapy
from scrapy.crawler import CrawlerProcess

class MySpider(scrapy.Spider):
    # Your spider definition
    ...

process = CrawlerProcess()
process.crawl(MySpider)
# the script will block here until the crawling is finished
process.start()