Accessing a web page through a proxy in Python

Question

I am new to Python and am doing some web scraping. I use the googlesearch module to get links, but after a number of requests Google blocks my IP. I then tried Tor over SOCKS, but the same thing happened. Now I have concluded that I should use proxies, but when I make a request through a proxy it throws an exception. Below is the code I use. When I set the same proxy manually in Chrome it works fine, so why does it throw an exception when I access the page from Python?

import requests

proxies = {'http': 'socks5://user:pass@host:port',
           'https': 'socks5://user:pass@host:port'}
resp = requests.get('http://https://www.google.com', proxies=proxies)

Answer 1

Please check your URL. It should be https://www.google.com/, but your code has the wrong URL (http://https://www.google.com). Other than that, if you still get an error, try sending a User-Agent header with the request.

import requests

proxies = {
    'http': 'socks5://user:pass@host:port',
    'https': 'socks5://user:pass@host:port'
}
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"
}
resp = requests.get('https://www.google.com', proxies=proxies, headers=headers)
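If it helps, here is a rough sketch that wraps the same idea with a couple of safety checks. The helper names (build_proxies, has_doubled_scheme, fetch) are made up for illustration and are not part of requests. It assumes the proxy really speaks SOCKS5 and that SOCKS support is installed (pip install "requests[socks]"); without that extra, requests raises an InvalidSchema error about missing SOCKS dependencies.

```python
from urllib.parse import quote

# Browser-like User-Agent, same idea as in the answer above.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"
}


def build_proxies(user, password, host, port, scheme="socks5"):
    """Build a requests-style proxies dict (hypothetical helper).

    The credentials are percent-encoded so that characters such as
    '@' or ':' in a password do not break the proxy URL.
    """
    proxy_url = "{}://{}:{}@{}:{}".format(
        scheme, quote(user, safe=""), quote(password, safe=""), host, port
    )
    return {"http": proxy_url, "https": proxy_url}


def has_doubled_scheme(url):
    """Return True for URLs like 'http://https://example.com'."""
    _, sep, rest = url.partition("://")
    return bool(sep) and rest.lower().startswith(("http://", "https://"))


def fetch(url, proxies, timeout=10.0):
    """Fetch a page through the proxy and return its text (hypothetical helper)."""
    # Catch the typo from the question before any network traffic happens.
    if has_doubled_scheme(url):
        raise ValueError("malformed URL (scheme given twice): " + url)

    # Imported here so the helpers above work without requests installed.
    import requests

    resp = requests.get(url, proxies=proxies, headers=HEADERS, timeout=timeout)
    resp.raise_for_status()  # turn HTTP 4xx/5xx responses into exceptions
    return resp.text


# Example usage (placeholder credentials, host and port):
# proxies = build_proxies("user", "pass", "127.0.0.1", 1080)
# html = fetch("https://www.google.com", proxies)
```

One more thing worth trying: with the socks5 scheme, DNS lookups happen on your machine, while socks5h makes the proxy resolve hostnames. Passing scheme="socks5h" to build_proxies switches to the latter.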
