
Running Scrapyd as a daemon on centos 6.10 python 3.6

I am trying to run my scrapers on my dedicated centos 6.10 server. I got python 3.6.6 installed, created a venv, and installed and ran scrapyd from a pip install. The command scrapyd shows this:

2018-10-24T12:23:56-0700 [-] Loading /usr/local/lib/python3.6/site-packages/scrapyd/txapp.py...
2018-10-24T12:23:57-0700 [-] Scrapyd web console available at http://127.0.0.1:6800/
2018-10-24T12:23:57-0700 [-] Loaded.
2018-10-24T12:23:57-0700 [twisted.scripts._twistd_unix.UnixAppLogger#info] twistd 18.7.0 (/usr/local/bin/python3.6 3.6.6) starting up.
2018-10-24T12:23:57-0700 [twisted.scripts._twistd_unix.UnixAppLogger#info] reactor class: twisted.internet.epollreactor.EPollReactor.
2018-10-24T12:23:57-0700 [-] Site starting on 6800
2018-10-24T12:23:57-0700 [twisted.web.server.Site#info] Starting factory <twisted.web.server.Site object at 0x7f4661cdf940>
2018-10-24T12:23:57-0700 [Launcher] Scrapyd 1.2.0 started: max_proc=16, runner='scrapyd.runner'

Totally cool. Now I have a couple of questions.

1 - If this is running on my dedicated server, does that mean that the scrapyd web console is then at [serverIP]:6800? Or, at least, is it supposed to be there? Because while the command is running, it doesn't appear; the website can't be found. So I've sort of hit a brick wall with this.

2 - Another thing is that I don't want to have to leave a browser or SSH terminal open to keep scrapyd running. All of the articles I have read have advised that there is no proper RPM package for scrapyd, and until somebody makes one I am out of luck. I am not personally a Linux expert; I am surprised I made it this far.

So I guess this is an issue for running scrapyd as a daemon on the server, because it needs special files. Can I install scrapyd directly from git? It didn't seem, however, that even the git repository had the files I apparently needed for this project to work.

If somebody could set me on the right track, guide me, or point me to an article where somebody has done the whole process on 6.10, that would be awesome.

1 - Use the scrapyd config file and add bind_address=0.0.0.0 to it:

# cat ~/.scrapyd.conf
[scrapyd]
bind_address=0.0.0.0

Start scrapyd and you should see something like:

2018-11-11T13:58:08-0800 [-] Scrapyd web console available at http://0.0.0.0:6800/

Now you should be able to access the web interface at [serverIP]:6800.
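Once the bind address is fixed, you can also verify the daemon is reachable programmatically through Scrapyd's daemonstatus.json endpoint (part of its JSON API). A minimal sketch, assuming you substitute your server's IP for 127.0.0.1:

```python
import json
from urllib.request import urlopen

def parse_status(body: str) -> dict:
    """Parse the JSON body returned by Scrapyd's daemonstatus.json endpoint."""
    return json.loads(body)

if __name__ == "__main__":
    # Replace 127.0.0.1 with your server's IP; 6800 is Scrapyd's default port.
    with urlopen("http://127.0.0.1:6800/daemonstatus.json") as resp:
        status = parse_status(resp.read().decode())
    # A healthy daemon reports status "ok" plus pending/running/finished counts.
    print(status)
```

If this works from your own machine but the browser still can't reach the console, the problem is between you and the server (firewall, security group), not scrapyd itself.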

2 - You can always use tmux for this; read https://hackernoon.com/a-gentle-introduction-to-tmux-8d784c404340
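The tmux workflow for this is short. A sketch, assuming your venv lives at ~/venv (adjust the path to your install):

```shell
# start a detached session named "scrapyd" that runs the daemon
tmux new-session -d -s scrapyd '~/venv/bin/scrapyd'

# reattach later to watch the console; press Ctrl-b then d to detach again
tmux attach -t scrapyd

# after logging out and back in, confirm the session is still alive
tmux ls
```

Because the session lives on the server, scrapyd keeps running after you close your SSH terminal.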

You can use @Rene_Xu's answer and check the firewall to see if it's dropping external connections. To keep scrapyd alive, you can write a simple script and turn it into a daemon, or just use crontab as explained here.
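The crontab route can be as small as one @reboot line. A sketch of the crontab entry (added via crontab -e), assuming a venv at /home/user/venv and a project at /home/user/project — both paths are placeholders for your setup:

```shell
# restart scrapyd whenever the server boots; append output to a log file
@reboot cd /home/user/project && /home/user/venv/bin/scrapyd >> /home/user/scrapyd.log 2>&1
```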

Also, check your hosting environment's settings; for example, if you are hosted on AWS, you need to configure security groups, network ACLs, etc. to allow incoming requests on this specific port.
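On a bare CentOS 6 box, the rough equivalent of a security group is iptables. A sketch of checking and opening the port (run as root; the exact rules on your machine may differ, so treat these as assumptions about a default setup):

```shell
# see whether any rule mentions scrapyd's port
iptables -L -n | grep 6800

# allow incoming TCP on port 6800, then persist the rule (CentOS 6 style)
iptables -I INPUT -p tcp --dport 6800 -j ACCEPT
service iptables save
```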

