The target file urls.txt
contains all the URLs to be downloaded.
├─spiders
│ │ stockInfo.py
│ │ urls.txt
│ │ __init__.py
stockInfo.py
is my Scrapy spider file.
import scrapy
import os
import re


class QuotesSpider(scrapy.Spider):
    name = "stockInfo"
    projectFile = r"d:/toturial/toturial/spiders/urls.txt"
    with open(projectFile, "r") as f:
        urls = f.readlines()
    start_urls = [url.strip() for url in urls]

    def parse(self, response):
        pass
I have tested that the above stockInfo.py
runs successfully on my local PC with the command:
scrapy crawl stockInfo
Now I deploy the project to the remote Scrapy Hub
with:
pip install shub
shub login
API key: xxxxxxxxxxxxxxxxx
shub deploy 380020
It runs into trouble:
IOError: [Errno 2] No such file or directory: 'd:/toturial/toturial/spiders/urls.txt'
How can I fix this when deploying my Scrapy project to Scrapy Hub?
Rewriting
projectFile = r"d:/toturial/toturial/spiders/urls.txt"
as
projectFile = "./urls.txt"
works when I run it on my local PC.
Strangely, the same rewrite does not work when the spider runs on the remote Scrapy Hub.
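A likely reason for the difference can be sketched in Python: a relative path such as ./urls.txt is resolved against the current working directory of the process, not against the directory that holds stockInfo.py. Locally the command happens to be run from a directory where the file is found, while on the remote end the working directory is different. The helper name resolve_relative and the two temporary directories are assumptions made only for this illustration.

```python
import os
import tempfile


def resolve_relative(path):
    """Return the absolute path that open(path) would use right now."""
    return os.path.abspath(path)


def demo():
    """Show that "./urls.txt" points to different files as the cwd changes."""
    original_cwd = os.getcwd()
    results = []
    try:
        with tempfile.TemporaryDirectory() as first, \
                tempfile.TemporaryDirectory() as second:
            for directory in (first, second):
                os.chdir(directory)
                results.append(resolve_relative("./urls.txt"))
            # Leave the temporary directories before they are removed.
            os.chdir(original_cwd)
    finally:
        os.chdir(original_cwd)
    return results


if __name__ == "__main__":
    for resolved in demo():
        print(resolved)
```

The two printed paths differ even though the string "./urls.txt" is the same, which is why a path that works from one working directory can raise "No such file or directory" from another.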
1. Add a new directory and move urls.txt
into it.
Create a new directory named resources
and save urls.txt
in it.
My new directory tree is as below.
tutorial
├─tutorial
│ ├─resources
│ │ └─urls.txt
│ ├─spiders
│ │ └─stockInfo.py
2. Rewrite setup.py as below, so that the text files in resources are packaged together with the code.
from setuptools import setup, find_packages

setup(
    name='tutorial',
    version='1.0',
    packages=find_packages(),
    package_data={
        'tutorial': ['resources/*.txt']
    },
    entry_points={
        'scrapy': ['settings = tutorial.settings']
    },
    zip_safe=False,
)
3. Rewrite stockInfo.py
as below.
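import scrapy
import os
import re
import pkgutil


class QuotesSpider(scrapy.Spider):
    name = "stockInfo"
    # Read urls.txt from inside the installed "tutorial" package, so the
    # lookup does not depend on the current working directory.
    data = pkgutil.get_data("tutorial", "resources/urls.txt")
    # get_data returns bytes; decode, then split on any line ending
    # ("\r\n" or "\n") and skip blank lines.
    start_urls = [url.strip() for url in data.decode().splitlines() if url.strip()]

    def parse(self, response):
        pass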
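The resource loading in step 3 can be tried outside Scrapy. The sketch below builds a throwaway package in a temporary directory, reads a bundled urls.txt through pkgutil.get_data, and turns the bytes into a list of URLs. The package name urls_demo_pkg, the helper names, and the example.com URLs are all assumptions made for the illustration; they are not part of the tutorial project. Using splitlines() rather than splitting on a fixed "\r\n" is a choice made here so the sketch works whichever line ending the file was saved with.

```python
import importlib
import os
import pkgutil
import sys
import tempfile


def parse_url_list(raw):
    """Decode bytes from pkgutil.get_data and return the non-empty lines."""
    text = raw.decode("utf-8")
    # splitlines() handles "\n", "\r\n" and "\r" alike.
    return [line.strip() for line in text.splitlines() if line.strip()]


def build_demo_package(root, package_name="urls_demo_pkg"):
    """Create <root>/<package_name>/resources/urls.txt (hypothetical layout)."""
    resources = os.path.join(root, package_name, "resources")
    os.makedirs(resources)
    with open(os.path.join(root, package_name, "__init__.py"), "w") as f:
        f.write("")
    with open(os.path.join(resources, "urls.txt"), "wb") as f:
        # Mixed line endings and a trailing blank line, on purpose.
        f.write(b"http://example.com/a\r\nhttp://example.com/b\n\n")
    return package_name


def load_urls_from_package(root):
    """Import the demo package from root and read its bundled URL list."""
    package_name = build_demo_package(root)
    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()
        # The resource path is relative to the package, not to the cwd.
        raw = pkgutil.get_data(package_name, "resources/urls.txt")
    finally:
        sys.path.remove(root)
        sys.modules.pop(package_name, None)
    return parse_url_list(raw)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        print(load_urls_from_package(root))
```

Because the file is looked up relative to the package, the result does not change with the directory the process is started from, provided the file was actually included in the package (which is what package_data in setup.py is for).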