
Error occurred when fetching a data file from a URL using Python

I tried to load data from a URL with pandas:

import pandas as pd

url = 'http://raw.githubusercontent.com/justmarkham/DAT8/master/data/chipotle.tsv'
chipo = pd.read_csv(url, sep='\t')

but it fails with this error:

URLError: <urlopen error [Errno 11004] getaddrinfo failed>

I've checked this answer, but it does not help.

I've also tried fetching the data with requests, and the same error occurred:

ConnectionError: HTTPConnectionPool(host='raw.githubusercontent.com', port=80): Max retries exceeded with url: /justmarkham/DAT8/master/data/chipotle.tsv (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000029B29E43748>: Failed to establish a new connection: [Errno 11004] getaddrinfo failed'))

It seems that something is wrong with DNS, so I edited the hosts file, but that did not help either. How can I fix this problem?

Thanks a lot.
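
Both tracebacks end in "getaddrinfo failed", which means the hostname lookup fails before any HTTP request is sent. A minimal sketch for confirming that from Python, using only the standard library, could look like this. The helper name can_resolve is hypothetical and not part of the original post.

```python
import socket


def can_resolve(host, port=80):
    """Return True if the local resolver can turn `host` into an address."""
    try:
        socket.getaddrinfo(host, port)
        return True
    except socket.gaierror:
        # Same failure class as "[Errno 11004] getaddrinfo failed" on Windows.
        return False
```

Calling can_resolve('raw.githubusercontent.com') on the affected machine should return False while the lookup is broken. If it returns True and the download still fails, the cause is probably somewhere other than DNS.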

Case solved. It turned out to be a DNS problem: I have to go through a proxy to reach the resource. That would also explain why others could not reproduce the problem.

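import socket

import pandas as pd
import socks  # provided by the PySocks package

# Route every new socket through the local SOCKS5 proxy at 127.0.0.1:10808
socks.set_default_proxy(socks.SOCKS5, '127.0.0.1', 10808)
socket.socket = socks.socksocket

url = 'https://raw.githubusercontent.com/justmarkham/DAT8/master/data/chipotle.tsv'
chipo = pd.read_csv(url, sep='\t')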
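
Replacing socket.socket affects every connection the process makes. If only this one download needs the proxy, a narrower variant is to pass the proxy to requests directly. The sketch below is not the method from the post: it assumes the same SOCKS5 proxy at 127.0.0.1:10808, assumes that proxy is able to resolve hostnames itself, and the function names socks_proxies and read_tsv_via_proxy are hypothetical.

```python
import io


def socks_proxies(host="127.0.0.1", port=10808):
    """Build a requests-style proxies mapping for a SOCKS5 proxy.

    The "socks5h" scheme asks the proxy to resolve hostnames, so the
    local DNS lookup that failed is skipped.
    """
    proxy_url = f"socks5h://{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}


def read_tsv_via_proxy(url, proxies, timeout=30):
    """Download `url` through `proxies` and parse it as tab-separated data."""
    # Third-party imports are kept local so socks_proxies() works without them.
    import pandas as pd
    import requests  # SOCKS support needs the extra: pip install "requests[socks]"

    response = requests.get(url, proxies=proxies, timeout=timeout)
    response.raise_for_status()
    return pd.read_csv(io.StringIO(response.text), sep="\t")
```

With the proxy running, chipo = read_tsv_via_proxy(url, socks_proxies()) should give the same DataFrame as the code above while leaving other connections untouched.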
