Connection error in python-requests

I'm trying to use BeautifulSoup with Anaconda for Python 3.6.
I want to scrape accuweather.com to find the weather in Tel Aviv.

This is my code:

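from bs4 import BeautifulSoup
import requests

data = requests.get("https://www.accuweather.com/he/il/tel-aviv/215854/weather-forecast/215854")
# "html.parser" is the name of Python's built-in HTML parser
soup = BeautifulSoup(data.text, "html.parser")
# look up the first <div> whose class is "info"
soup.find('div', {'class': 'info'})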

I get this error:

raise ConnectionError(err, request=request)
ConnectionError: ('Connection aborted.', OSError("(10060, 'WSAETIMEDOUT')",))

What can I do, and what does this error mean?

The problem does not come from the code, but from the website.
If you add a User-Agent field to the request headers, the request will look like it comes from a browser.

Example:

from bs4 import BeautifulSoup
import requests

headers = {
     'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36',
}

data=requests.get("https://www.accuweather.com/he/il/tel-aviv/215854/weather-forecast/215854", headers=headers) 

What does this error mean

Googling for "errno 10060" yields quite a few results. Basically, it's a low-level network error (it's not HTTP-specific; you can have the same issue with any kind of network connection), whose canonical description is:

A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

In other words, your system failed to connect to the host. There can be many reasons for this: some temporary (your internet connection is down), some not (a proxy, if you are behind one, blocking access to this host), or quite simply, as is the case here, the host blocking your requests.
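Because the failure happens below HTTP, it can be checked without requests at all. The following is a minimal sketch, not part of the original answer, that attempts a plain TCP connection with the standard library's socket.create_connection. The helper name can_connect and the 5-second default timeout are assumptions chosen for illustration.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds.

    `can_connect` is a hypothetical helper name used only for this sketch.
    """
    try:
        # create_connection resolves the host name and opens a TCP socket.
        # Timeouts, refused connections and DNS failures are all OSError subclasses.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError as err:
        print("Could not connect to %s:%s: %r" % (host, port, err))
        return False
```

For example, if can_connect("www.accuweather.com", 443) returns False, the problem is probably in the network path (connection, proxy, firewall) and not in anything HTTP-specific. A True result does not rule out blocking by the host, which may accept the TCP connection and then stop responding.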

The first thing to do when you get such an error is to check your internet connection, then try to open the URL in your browser. If the browser can load it, then the host is most likely blocking you, usually based on your client's "User-Agent" header (the client here is requests). Specifying a "standard" User-Agent header, as explained in newbie's answer, should solve the problem (it does in this case, or at least it did for me).

NB: to set the user agent:

headers = {
     'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36',
}
data = requests.get("https://www.accuweather.com/he/il/tel-aviv/215854/weather-forecast/215854", headers=headers) 
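Since a blocked or unreachable host can otherwise leave the script waiting for the operating system's own timeout, it may also help to pass an explicit timeout and catch the connection errors. This is a rough sketch under a few assumptions: the names fetch_html and BROWSER_HEADERS, and the 10-second default, are made up for illustration. The calls themselves (requests.get with headers and timeout, and the exception classes in requests.exceptions) are part of the requests library.

```python
import requests

# Hypothetical constant name; the value is the User-Agent string used above.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36",
}


def fetch_html(url, timeout=10.0):
    """Return the response body as text, or None if the host could not be reached.

    `fetch_html` is a hypothetical helper name used only for this sketch.
    """
    try:
        response = requests.get(url, headers=BROWSER_HEADERS, timeout=timeout)
    except requests.exceptions.Timeout:
        # No answer within `timeout` seconds (while connecting or while reading).
        print("Timed out after %s seconds: %s" % (timeout, url))
        return None
    except requests.exceptions.ConnectionError as err:
        # Low-level network failure, e.g. the connection was refused or aborted.
        print("Connection failed for %s: %r" % (url, err))
        return None
    # Raise requests.exceptions.HTTPError for 4xx/5xx status codes.
    response.raise_for_status()
    return response.text
```

The returned text can then be passed to BeautifulSoup as in the question; a None result means the request never got a response, so there is nothing to parse.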
