
Running Scrapyd as a daemon on CentOS 6.10 with Python 3.6

I am trying to run my scrapers on my dedicated CentOS 6.10 server. I got Python 3.6.6 installed, created a venv, and installed and ran Scrapyd from a pip install. The scrapyd command shows this:

2018-10-24T12:23:56-0700 [-] Loading /usr/local/lib/python3.6/site-packages/scrapyd/txapp.py...
2018-10-24T12:23:57-0700 [-] Scrapyd web console available at http://127.0.0.1:6800/
2018-10-24T12:23:57-0700 [-] Loaded.
2018-10-24T12:23:57-0700 [twisted.scripts._twistd_unix.UnixAppLogger#info] twistd 18.7.0 (/usr/local/bin/python3.6 3.6.6) starting up.
2018-10-24T12:23:57-0700 [twisted.scripts._twistd_unix.UnixAppLogger#info] reactor class: twisted.internet.epollreactor.EPollReactor.
2018-10-24T12:23:57-0700 [-] Site starting on 6800
2018-10-24T12:23:57-0700 [twisted.web.server.Site#info] Starting factory <twisted.web.server.Site object at 0x7f4661cdf940>
2018-10-24T12:23:57-0700 [Launcher] Scrapyd 1.2.0 started: max_proc=16, runner='scrapyd.runner'

Totally cool. Now I have a couple of questions.

1- If this is running on my dedicated server, does that mean the Scrapyd web console is at [serverIP]:6800? Or at least, is it supposed to be there? Because while the command is running, it doesn't appear; the website can't be found. So I have sort of hit a brick wall with this.

2- Another thing is that I don't want to have to leave a browser or SSH terminal open to keep Scrapyd running. All of the articles I have read say that there is no proper RPM package for Scrapyd, and until somebody makes one I am out of luck. I am not personally a Linux expert; I am surprised I made it this far.

So I guess the issue is running Scrapyd as a daemon on the server, because that needs special files. Could I install Scrapyd directly from the Git repository? It didn't seem, however, that even the repository had the files I apparently needed for this to work.

If somebody could put me on the right track, guide me, or point me to an article where somebody has done the whole process on 6.10, that would be awesome.

1 - use the Scrapyd config file and add bind_address=0.0.0.0 to it:

# cat ~/.scrapyd.conf
[scrapyd]
bind_address = 0.0.0.0

Start Scrapyd and you should see something like:

2018-11-11T13:58:08-0800 [-] Scrapyd web console available at http://0.0.0.0:6800/

Now you should be able to access the web interface at [serverIP]:6800.
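To confirm from the server itself that the daemon is reachable before trying a browser, a quick check with curl (assuming curl is installed) might look like:

```shell
# ask the console for its front page; "200" means scrapyd is up and listening
curl -s -o /dev/null -w "%{http_code}\n" http://127.0.0.1:6800/

# repeat with the server's public address (substitute your real IP for the
# [serverIP] placeholder) to verify the bind_address change took effect
curl -s -o /dev/null -w "%{http_code}\n" http://[serverIP]:6800/
```

If the first command prints 200 but the second does not, the daemon is fine and the problem is between you and the server (firewall, hosting rules).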

2 - you can always use tmux for this; see https://hackernoon.com/a-gentle-introduction-to-tmux-8d784c404340
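As a sketch (assuming tmux is installed, e.g. via yum install tmux), the whole thing can run in a detached session that survives your SSH logout:

```shell
# start scrapyd inside a detached tmux session named "scrapyd"
tmux new-session -d -s scrapyd scrapyd

# later: re-attach to the session to watch the log output
tmux attach -t scrapyd

# press Ctrl-b then d to detach again; scrapyd keeps running after you log out
```

The session name "scrapyd" here is just a label of my choosing; any name works.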

You can use @Rene_Xu's answer and check the firewall to see if it's dropping external connections. To keep Scrapyd alive, you can write a simple script and turn it into a daemon, or just use crontab as explained here
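A minimal sketch of such a keep-alive script (the script path, venv path, log path, and the five-minute interval below are all assumptions; adjust them to your layout):

```shell
#!/bin/sh
# /usr/local/bin/scrapyd-keepalive.sh (hypothetical path)
# Restart scrapyd if nothing answers on its port anymore.
if ! curl -s -o /dev/null http://127.0.0.1:6800/; then
    # nohup + & detach scrapyd from the cron shell so it keeps running;
    # point this at the scrapyd inside your venv
    nohup /path/to/venv/bin/scrapyd >> /var/log/scrapyd.log 2>&1 &
fi
```

Then register it with crontab -e so cron re-runs the check periodically, e.g. `*/5 * * * * /bin/sh /usr/local/bin/scrapyd-keepalive.sh`.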

Also, please check your hosting environment's settings; for example, if you are hosted on AWS, you need to configure security groups, network ACLs, etc. to allow incoming requests on this particular port.
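On CentOS 6 itself the firewall is plain iptables (not firewalld), so if the port is blocked on the server this sketch (run as root; not a hardened ruleset) opens it:

```shell
# allow incoming TCP connections on scrapyd's port
iptables -I INPUT -p tcp --dport 6800 -j ACCEPT

# persist the rule across reboots via the CentOS 6 init script
service iptables save
```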
