
Scrapy change / update public IP via Proxy

I am using Scrapy to crawl Google and I want to change my IP from code. The output shows the same public IP as my local machine, even though the proxy in the response's meta does change. If I log in to that VM and request the site from there, it shows the VM's IP, which is the one I set with request.meta['proxy'] = ip, but from my code it only shows my local public IP.

This is my code.

middleware.py

import base64

class ProxyMiddleware(object):
    def process_request(self, request, spider):
        # username, password and ip are defined elsewhere (not shown here);
        # `pass` is a reserved word, and base64.encodestring was removed in Python 3.9
        encoded_user_pass = base64.b64encode(('%s:%s' % (username, password)).encode()).decode()
        request.headers['Proxy-Authorization'] = 'Basic ' + encoded_user_pass
        request.meta['proxy'] = ip

settings.py

DOWNLOADER_MIDDLEWARES = {
    'tutorial.middlewares.RotateUserAgentMiddleware': 400,
    'tutorial.middlewares.ProxyMiddleware': 100,
    'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
    'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': 110,
}

spider1.py

request = scrapy.Request(url='http://checkip.dyndns.org/', callback=self.check_ip)

def check_ip(self, response):
    print(response.meta)
    # Raw string so that \d and \. are not treated as string escapes
    pub_ip = response.xpath('//body/text()').re(r'\d+\.\d+\.\d+\.\d+')[0]
    print("My public IP is: " + pub_ip)

Output:

{'proxy': 'http://51.162.81.60', 'download_timeout': 360.0, 'download_slot': 'checkip.dyndns.org', 'download_latency': 19.054762840270996}
My public IP is: 118.110.179.234

As I understand it, the proxy IP has to be the address of an actual proxy server, that is, a proxy server must be reachable at the IP you provide. You can't simply assign an arbitrary IP to a request. Rotating IPs is a different matter altogether.
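One rough way to check whether anything is listening at the proxy address is to try opening a TCP connection to it. The sketch below uses only the standard library. The function name proxy_is_reachable and the timeout value are my own choices, and the port must be whatever your proxy actually listens on (the question does not say).

```python
import socket


def proxy_is_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened.

    This only shows that something accepts connections there; it does
    not prove that the service is a working HTTP proxy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution errors.
        return False


# Hypothetical usage (8080 is an assumed port, not taken from the question):
# proxy_is_reachable("51.162.81.60", 8080)
```

If this returns False from the machine running the spider, the request cannot be going through that proxy.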

Also, just in case, specify the scheme (http or https) and the port. I'm not sure whether Scrapy falls back to a default when the scheme or port is missing.
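As a small illustration, the proxy value could be checked for a scheme and a port before it is assigned to request.meta['proxy']. This is only a sketch: the helper names proxy_url_problems and build_proxy_url are made up for this example, the port 8080 is an assumption, and restricting the scheme to http/https is my own simplification.

```python
from urllib.parse import urlsplit


def proxy_url_problems(proxy_url):
    """Return a list of problems found in a proxy URL (empty if none)."""
    problems = []
    parts = urlsplit(proxy_url)
    if parts.scheme not in ("http", "https"):
        problems.append("missing or unsupported scheme")
    if not parts.hostname:
        problems.append("missing host")
    elif parts.port is None:
        problems.append("missing port")
    return problems


def build_proxy_url(host, port, scheme="http"):
    """Build an explicit scheme://host:port proxy URL."""
    return "%s://%s:%d" % (scheme, host, port)


# The value from the question's output has a scheme but no port:
print(proxy_url_problems("http://51.162.81.60"))
# With an assumed port of 8080 the URL is fully specified:
print(build_proxy_url("51.162.81.60", 8080))
```

The result of build_proxy_url is the kind of value I would put in request.meta['proxy'].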

Also, please see the documentation.

