Connection error in python-requests

I'm using BeautifulSoup (with Anaconda, Python 3.6) to scrape accuweather.com for the weather in Tel Aviv.

This is my code:

from bs4 import BeautifulSoup
import requests

data = requests.get("https://www.accuweather.com/he/il/tel-aviv/215854/weather-forecast/215854")
soup = BeautifulSoup(data.text, "html.parser")
soup.find('div', {'class': 'info'})

I get this error:

raise ConnectionError(err, request=request)
ConnectionError: ('Connection aborted.', OSError("(10060, 'WSAETIMEDOUT')",))

What can I do and what does this error mean?

The problem is not in your code; the website is refusing the request.
If you add a User-Agent field to the request headers, the request will look like it comes from a browser.

Example:

from bs4 import BeautifulSoup
import requests

headers = {
     'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36',
}

data=requests.get("https://www.accuweather.com/he/il/tel-aviv/215854/weather-forecast/215854", headers=headers) 

What does this error mean?

Googling for "errno 10060" yields quite a few results. Basically, it's a low-level network error (it's not HTTP-specific; you can hit the same issue with any kind of network connection), whose canonical description is:

A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

In other words, your system failed to connect to the host. This can happen for many reasons, either temporary (your internet connection is down) or persistent (a proxy, if you are behind one, blocking access to this host), or quite simply, as is the case here, the host blocking your requests.
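Because the error is not HTTP-specific, the same kind of failure can be observed with a bare TCP socket. The following is a minimal sketch using only the standard library; try_connect is a hypothetical helper name, and the outcome labels it returns are chosen purely for illustration:

```python
import socket


def try_connect(host, port, timeout=5.0):
    """Attempt a plain TCP connection and report what happened."""
    try:
        # create_connection resolves the host and opens a TCP socket
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except (socket.timeout, TimeoutError):
        # The peer never answered in time, which is the situation
        # that WSAETIMEDOUT (10060) describes on Windows
        return "timed out"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port
        return "refused"
    except OSError as exc:
        # DNS failures, unreachable networks, and similar errors
        return "failed: {}".format(exc)
```

Calling try_connect("www.accuweather.com", 443) from the affected machine gives a rough hint: if it returns "connected" while requests still times out, the network path is probably fine and the host is likely stalling on the request itself.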

The first thing to do when you get such an error is to check your internet connection, then try to open the URL in your browser. If it loads in your browser, the host is most likely blocking you, usually based on your client's "User-Agent" header (the client here is requests). Specifying a "standard" user-agent header, as explained in newbie's answer, should solve the problem (and it does in this case, or at least it did for me).

NB: to set the user agent:

headers = {
     'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36',
}
data = requests.get("https://www.accuweather.com/he/il/tel-aviv/215854/weather-forecast/215854", headers=headers) 
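To illustrate the user-agent filtering described above, here is a self-contained sketch that uses only the standard library (urllib instead of requests, so it runs without extra packages). PickyHandler, fetch_status, and run_demo are hypothetical names, and the rule "reject anything without Mozilla in the User-Agent" is an assumption made for the demo, not AccuWeather's actual policy; the real site appears to stall the connection rather than answer with 403.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

BROWSER_UA = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) "
              "AppleWebKit/537.36 (KHTML, like Gecko) "
              "Chrome/68.0.3440.106 Safari/537.36")


class PickyHandler(BaseHTTPRequestHandler):
    """Toy server: rejects any client whose User-Agent lacks 'Mozilla'."""

    def do_GET(self):
        agent = self.headers.get("User-Agent", "")
        status = 200 if "Mozilla" in agent else 403
        body = b"forecast" if status == 200 else b"blocked"
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch_status(url, user_agent=None):
    """Return the HTTP status code, sending user_agent if one is given."""
    headers = {"User-Agent": user_agent} if user_agent else {}
    request = urllib.request.Request(url, headers=headers)
    # An empty ProxyHandler bypasses any proxy configured in the environment
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(request, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code


def run_demo():
    """Start the toy server, then fetch without and with a browser UA."""
    server = HTTPServer(("127.0.0.1", 0), PickyHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:{}/".format(server.server_port)
    try:
        return fetch_status(url), fetch_status(url, BROWSER_UA)
    finally:
        server.shutdown()
        server.server_close()
```

run_demo() returns a pair of status codes: the first request goes out with urllib's default "Python-urllib" agent and is rejected by the toy rule, while the second carries the browser-style string and is accepted.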
