Problem with Python requests while using proxies
I am trying to scrape a website using Python requests. The website can only be scraped through proxies, so I implemented the code for that. However, it is banning all my requests even when I am using proxies, so I used https://api.ipify.org/?format=json to check whether the proxies are working properly. It shows my original IP even while using proxies. The code is below:
from concurrent.futures import ThreadPoolExecutor
import string, random
import requests
import sys
http = []
#loading http into the list
with open(sys.argv[1],"r",encoding = "utf-8") as data:
    for i in data:
        http.append(i[:-1])
    data.close()
url = "https://api.ipify.org/?format=json"
def fetch(session, url):
    for i in range(5):
        proxy = {'http': 'http://'+random.choice(http)}
        try:
            with session.get(url,proxies = proxy, allow_redirects=False) as response:
                print("Proxy : ",proxy," | Response : ",response.text)
                break
        except:
            pass
# @timer(1, 5)
if __name__ == '__main__':
    with ThreadPoolExecutor(max_workers=1) as executor:
        with requests.Session() as session:
            executor.map(fetch, [session] * 100, [url] * 100)
            executor.shutdown(wait=True)
I tried a lot but don't understand why my own IP address is shown instead of the proxy's IPv4 address. You will find the output of the code here: https://imgur.com/a/z02uSvi
The problem is that you have set a proxy only for http, while you are sending requests to a website that uses https. Because the proxies dictionary has no https entry, those requests go out directly from your own IP. The solution is simple:
proxies = dict.fromkeys(('http', 'https', 'ftp'), 'http://' + random.choice(http))
# You can set proxy for session
session.proxies.update(proxies)
response = session.get(url)
# Or you can pass proxy as argument
response = session.get(url, proxies=proxies)
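To make the cause easier to see, here is a minimal sketch of scheme-based proxy selection. It is a simplified model, not the actual requests implementation (requests also considers host-specific keys and an `all` key). The helper names `build_proxies` and `proxy_for` and the address 203.0.113.10:8080 are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlparse


def build_proxies(host_port):
    """Route both http and https traffic through the same HTTP proxy."""
    proxy_url = "http://" + host_port
    return {"http": proxy_url, "https": proxy_url}


def proxy_for(url, proxies):
    """Simplified model: look up the proxy by the scheme of the target URL."""
    scheme = urlparse(url).scheme
    # None means no proxy matches, so the request would go out directly.
    return proxies.get(scheme)


target = "https://api.ipify.org/?format=json"

# Hypothetical proxy address, used only for illustration.
http_only = {"http": "http://203.0.113.10:8080"}
both = build_proxies("203.0.113.10:8080")

print("http-only mapping ->", proxy_for(target, http_only))
print("http+https mapping ->", proxy_for(target, both))
```

With only an `http` key, the https target finds no matching proxy, which is consistent with the original IP showing up in the question. The dictionary returned by `build_proxies` could then be passed as `proxies=` to `session.get`, as in the answer above.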