Error when fetching a data file from a URL with Python

Question

I tried to load data from a URL:

import pandas as pd

url = 'http://raw.githubusercontent.com/justmarkham/DAT8/master/data/chipotle.tsv'
chipo = pd.read_csv(url, sep='\t')

and got this error:

URLError: <urlopen error [Errno 11004] getaddrinfo failed>

I've checked this answer, but it did not help.

I also tried fetching the data with requests, and the error occurred again:

ConnectionError: HTTPConnectionPool(host='raw.githubusercontent.com', port=80): Max retries exceeded with url: /justmarkham/DAT8/master/data/chipotle.tsv (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000029B29E43748>: Failed to establish a new connection: [Errno 11004] getaddrinfo failed'))

The problem looks DNS-related, so I edited the hosts file, but that did not help either. How can I fix this?

Thanks a lot.

Answer 1

Solved. It turned out to be a DNS problem, and I needed a proxy to reach the resource. That would also explain why the problem is not reproducible.

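import socket

import pandas as pd
import socks  # provided by the PySocks package

# Route every new socket through the local SOCKS5 proxy; adjust host and port to your own proxy.
socks.set_default_proxy(socks.SOCKS5, '127.0.0.1', 10808)
socket.socket = socks.socksocket

url = 'https://raw.githubusercontent.com/justmarkham/DAT8/master/data/chipotle.tsv'
chipo = pd.read_csv(url, sep='\t')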
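
For anyone hitting the same getaddrinfo error, here is a rough sketch of two helpers that may help narrow it down. The names can_resolve and socks_proxies are made up for illustration and are not part of any library. The first checks whether the local resolver can look up a host at all. The second builds a proxies mapping for requests, assuming a SOCKS5 proxy on 127.0.0.1:10808 as in the snippet above. The socks5h scheme asks the proxy to do the DNS lookup, so it should sidestep a broken local resolver without patching socket globally.

```python
import socket


def can_resolve(host, port=443):
    """Return True if the local resolver can turn `host` into an address."""
    try:
        socket.getaddrinfo(host, port)
        return True
    except socket.gaierror:
        # Same failure class as "[Errno 11004] getaddrinfo failed".
        return False


def socks_proxies(host="127.0.0.1", port=10808):
    """Build a requests-style proxies mapping for a SOCKS5 proxy.

    Host and port are assumptions about the local proxy setup.
    The "socks5h" scheme means hostnames are resolved by the proxy.
    """
    proxy = f"socks5h://{host}:{port}"
    return {"http": proxy, "https": proxy}
```

If can_resolve('raw.githubusercontent.com') returns False, the failure is in name resolution rather than in pandas or requests. With SOCKS support installed (pip install requests[socks]), the mapping can be passed per call as requests.get(url, proxies=socks_proxies()), which keeps the proxy scoped to that request.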

Answer 1: accepted, 2021-01-19 05:52:52
