
Using python requests with a big list of proxies

I am trying to create a program that uses a list of proxies iteratively, so that each proxy is used in turn from the beginning of the list to the end, and then the cycle starts over again. The way to use proxies with requests seems to be the following:

proxyDict = {
    # note: the keys are URL schemes, so this second "http" entry
    # silently overwrites the first -- only one proxy per scheme is kept
    "http": "http://177.86.8.166:3128",
    "http": "http://177.223.187.126:3128",
}

r = requests.get(url, headers=headers, proxies=proxyDict)

I have a big list of proxies like below.

177.86.8.166:3128
177.69.237.53:3128
177.223.187.126:3128
177.101.172.14:3128
177.185.114.89:53281
177.128.192.125:8089
177.128.210.250:8080

I have thought about using a loop to build a proxyDict from these proxies in memory and then run my program. Is this the best way to do it? I also want to retry a failed request with another proxy, and keep retrying until the request succeeds. I am thinking of using a try/except block for this; is that the best way, or is there a better approach?
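Roughly, something like the sketch below is what I have in mind (proxy_list, url, and headers are just placeholder names, not anything from a specific library):

import itertools
import requests

proxy_list = [
    "177.86.8.166:3128",
    "177.69.237.53:3128",
    "177.223.187.126:3128",
]

url = "http://example.com"            # placeholder target
headers = {"User-Agent": "my-agent"}  # placeholder headers

# cycle through the proxies forever; if a request fails,
# move on to the next proxy until one succeeds
for proxy in itertools.cycle(proxy_list):
    proxy_dict = {"http": "http://" + proxy, "https": "http://" + proxy}
    try:
        r = requests.get(url, headers=headers, proxies=proxy_dict)
        if r.status_code == 200:
            break  # success -- stop retrying
    except requests.RequestException:
        continue   # this proxy failed, try the next one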

I've just done something similar, although I used grequests. A couple of thoughts for you. I'd add a timeout to your requests or your code will hang:

>>> r = requests.get(url, headers=headers, proxies=my_proxy, timeout=5)

Each response will have a status_code, so use this to check whether the request was successful. I usually try a few times, just in case there was a timeout, for example:

>>> import requests
>>> r = requests.get('https://httpbin.org/status/404')
>>> r.status_code
404

Then, if the request fails, say, 5 times, you can move on to the next proxy:

if tries > 5:
    my_proxy = new_proxy_server

In my case I just created a list of proxies and iterated through it with a for loop.
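For illustration only, a minimal sketch of what that loop might look like, combining the timeout, the status_code check, and the 5-tries-per-proxy idea above (proxies, url, and headers are placeholder names, not anything from requests itself):

import requests

proxies = [
    "177.86.8.166:3128",
    "177.69.237.53:3128",
    "177.223.187.126:3128",
]

url = "http://example.com"            # placeholder target
headers = {"User-Agent": "my-agent"}  # placeholder headers

result = None
for proxy in proxies:
    proxy_dict = {"http": "http://" + proxy, "https": "http://" + proxy}
    tries = 0
    while tries < 5:                  # up to 5 attempts per proxy
        try:
            r = requests.get(url, headers=headers,
                             proxies=proxy_dict, timeout=5)
            if r.status_code == 200:
                result = r
                break                 # success with this proxy
        except requests.RequestException:
            pass                      # timeout / connection error, retry
        tries += 1
    if result is not None:
        break                         # stop once a request succeeded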
