
Confusion about rotating proxy IPs in the Scrapy framework

I am trying to use a randomly chosen proxy IP for each request in the Scrapy framework. (I use Python 3.6 and Scrapy 1.5.1; my project is named ip and the spider is named ip_test.) I get this confusing error:

raise SchemeNotSupported("Unsupported scheme: %r" % (uri.scheme,))
twisted.web.error.SchemeNotSupported: Unsupported scheme: b''

I don't know what I did wrong. This is my middlewares.py:

import random

class IpDownloaderMiddleware(object):
    PROXY = ["117.95.7.27:11170", "119.114.17.24:38715", "183.149.2.23:28970", "117.60.3.6:26965",
             "123.245.11.50:25550"]

    def process_request(self, request, spider):
        proxy = random.choice(self.PROXY)
        request.meta["proxy"] = proxy

And this is my settings.py

DOWNLOADER_MIDDLEWARES = {'ip.middlewares.IpDownloaderMiddleware': 100,}

thx!

As indicated by the error message, Scrapy (or to be precise, Twisted) requires the proxy URL to include a scheme, instead of only <host>:<port>.

E.g., instead of setting request.meta["proxy"] = '117.95.7.27:11170', you need request.meta["proxy"] = 'http://117.95.7.27:11170'
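Applied to the middleware from the question, the fix could look roughly like the sketch below. The helper ensure_scheme and its default of "http" are assumptions added for illustration; they are not part of Scrapy or of the original code. It also assumes the listed proxies speak plain HTTP.

```python
import random


def ensure_scheme(proxy, default_scheme="http"):
    """Hypothetical helper: prepend a scheme if the proxy string lacks one."""
    if "://" in proxy:
        return proxy
    return "{}://{}".format(default_scheme, proxy)


class IpDownloaderMiddleware(object):
    # Same host:port entries as in the question.
    PROXY = ["117.95.7.27:11170", "119.114.17.24:38715", "183.149.2.23:28970",
             "117.60.3.6:26965", "123.245.11.50:25550"]

    def process_request(self, request, spider):
        # Pick a proxy at random and store it as a full URL with a scheme.
        proxy = random.choice(self.PROXY)
        request.meta["proxy"] = ensure_scheme(proxy)
```

Alternatively, the entries in PROXY can simply be written with the http:// prefix, in which case no helper is needed.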

