Accessing Webpages Using Proxies in Python
I am new to Python and doing web scraping. I use the googlesearch module to get links, but after many requests Google blocks my IP. I then used Tor and made the same requests through SOCKS, but the same thing happened again. Now I have come to the solution of using proxies, but when I make a request through a proxy it throws an exception. Below is the code I use. When I run Chrome with the same proxy configured manually it works fine, so why does it throw an exception when I access the page from Python?
import requests

proxies = {'http': 'socks5://user:pass@host:port',
           'https': 'socks5://user:pass@host:port'}
resp = requests.get('http://https://www.google.com', proxies=proxies)
Please check your URL. It should be
https://www.google.com/
but your code has http://https://www.google.com, which is not a valid URL. Other than that, if you still get an error, try adding a User-Agent header to the request.
# SOCKS proxy support in requests needs PySocks: pip install requests[socks]
import requests

proxies = {
    'http': 'socks5://user:pass@host:port',
    'https': 'socks5://user:pass@host:port'
}
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"
}
resp = requests.get('https://www.google.com', proxies=proxies, headers=headers)
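Since the root cause here was a malformed URL, you can catch this kind of mistake before making the request with the standard-library urllib.parse module. This is a minimal sketch under that assumption; the check_url helper is hypothetical, not part of requests:

```python
from urllib.parse import urlparse

def check_url(url):
    # Consider a URL well-formed here if it has an http/https scheme,
    # a non-empty host, and no second scheme smuggled into the path
    # (as in 'http://https://www.google.com').
    parsed = urlparse(url)
    if parsed.scheme not in ('http', 'https'):
        return False
    if not parsed.netloc or '//' in parsed.path:
        return False
    return True

print(check_url('https://www.google.com'))         # True
print(check_url('http://https://www.google.com'))  # False
```

Validating the URL up front turns a confusing proxy-related exception into an obvious input error.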