Python: Error when loading MNIST data

An error occurred when I loaded the MNIST data with the following code. (Anaconda is already installed, and I am writing the code in a Jupyter notebook.)

from sklearn.datasets import fetch_mldata
mnist = fetch_mldata('MNIST original')

A TimeoutError appeared, and I have no idea where I went wrong. I turned off my VPN proxy, but that did not help. Help!

TimeoutError                              Traceback (most recent call last)
<ipython-input-1-3ba7b9c02a3b> in <module>()
      1 from sklearn.datasets import fetch_mldata
----> 2 mnist = fetch_mldata('MNIST original')

~\Anaconda3\lib\site-packages\sklearn\datasets\mldata.py in fetch_mldata(dataname, target_name, data_name, transpose_data, data_home)
    152         urlname = MLDATA_BASE_URL % quote(dataname)
    153         try:
--> 154             mldata_url = urlopen(urlname)
    155         except HTTPError as e:
    156             if e.code == 404:

~\Anaconda3\lib\urllib\request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
    221     else:
    222         opener = _opener
--> 223     return opener.open(url, data, timeout)
    224 
    225 def install_opener(opener):

~\Anaconda3\lib\urllib\request.py in open(self, fullurl, data, timeout)
    524             req = meth(req)
    525 
--> 526         response = self._open(req, data)
    527 
    528         # post-process response

~\Anaconda3\lib\urllib\request.py in _open(self, req, data)
    542         protocol = req.type
    543         result = self._call_chain(self.handle_open, protocol, protocol +
--> 544                                   '_open', req)
    545         if result:
    546             return result

~\Anaconda3\lib\urllib\request.py in _call_chain(self, chain, kind, meth_name, *args)
    502         for handler in handlers:
    503             func = getattr(handler, meth_name)
--> 504             result = func(*args)
    505             if result is not None:
    506                 return result

~\Anaconda3\lib\urllib\request.py in http_open(self, req)
   1344 
   1345     def http_open(self, req):
-> 1346         return self.do_open(http.client.HTTPConnection, req)
   1347 
   1348     http_request = AbstractHTTPHandler.do_request_

~\Anaconda3\lib\urllib\request.py in do_open(self, http_class, req, **http_conn_args)
   1319             except OSError as err: # timeout error
   1320                 raise URLError(err)
-> 1321             r = h.getresponse()
   1322         except:
   1323             h.close()

~\Anaconda3\lib\http\client.py in getresponse(self)
   1329         try:
   1330             try:
-> 1331                 response.begin()
   1332             except ConnectionError:
   1333                 self.close()

~\Anaconda3\lib\http\client.py in begin(self)
    295         # read until we get a non-100 response
    296         while True:
--> 297             version, status, reason = self._read_status()
    298             if status != CONTINUE:
    299                 break

~\Anaconda3\lib\http\client.py in _read_status(self)
    256 
    257     def _read_status(self):
--> 258         line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
    259         if len(line) > _MAXLINE:
    260             raise LineTooLong("status line")

~\Anaconda3\lib\socket.py in readinto(self, b)
    584         while True:
    585             try:
--> 586                 return self._sock.recv_into(b)
    587             except timeout:
    588                 self._timeout_occurred = True

TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

I downloaded the MNIST dataset and tried to load the data myself instead. I copied some code for loading MNIST, but loading failed again. I suspect I need to change part of the code rather than copy it verbatim from the Internet, but I don't know what to change (I'm just a Python beginner). Below is the code I used to load the downloaded MNIST data. Is it because I put the data in the wrong folder?

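# Imports that loadmnist needs (assumed to have been run in an earlier cell)
from struct import unpack
import numpy as np

def loadmnist(imagefile, labelfile):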

    # Open the images with gzip in read binary mode
    images = open(imagefile, 'rb')
    labels = open(labelfile, 'rb')

    # Get metadata for images
    images.read(4)  # skip the magic_number
    number_of_images = images.read(4)
    number_of_images = unpack('>I', number_of_images)[0]
    rows = images.read(4)
    rows = unpack('>I', rows)[0]
    cols = images.read(4)
    cols = unpack('>I', cols)[0]

    # Get metadata for labels
    labels.read(4)
    N = labels.read(4)
    N = unpack('>I', N)[0]

    # Get data
    x = np.zeros((N, rows*cols), dtype=np.uint8)  # Initialize numpy array
    y = np.zeros(N, dtype=np.uint8)  # Initialize numpy array
    for i in range(N):
        for j in range(rows*cols):
            tmp_pixel = images.read(1)  # Just a single byte
            tmp_pixel = unpack('>B', tmp_pixel)[0]
            x[i][j] = tmp_pixel
        tmp_label = labels.read(1)
        y[i] = unpack('>B', tmp_label)[0]

    images.close()
    labels.close()
    return (x, y)

The part above runs fine.

train_img, train_lbl = loadmnist('data/train-images-idx3-ubyte'
                                 , 'data/train-labels-idx1-ubyte')
test_img, test_lbl = loadmnist('data/t10k-images-idx3-ubyte'
                               , 'data/t10k-labels-idx1-ubyte')

The error looks like this:

FileNotFoundError                         Traceback (most recent call last)
<ipython-input-5-b23a5078b5bb> in <module>()
      1 train_img, train_lbl = loadmnist('data/train-images-idx3-ubyte'
----> 2                                  , 'data/train-labels-idx1-ubyte')
      3 test_img, test_lbl = loadmnist('data/t10k-images-idx3-ubyte'
      4                                , 'data/t10k-labels-idx1-ubyte')

<ipython-input-4-967098b85f28> in loadmnist(imagefile, labelfile)
      2 
      3     # Open the images with gzip in read binary mode
----> 4     images = open(imagefile, 'rb')
      5     labels = open(labelfile, 'rb')
      6 

FileNotFoundError: [Errno 2] No such file or directory: 'data/train-images-idx3-ubyte'

I put the downloaded data in a folder I had just created.

If you want to load the dataset directly from a library instead of downloading it and loading it yourself, load it from Keras.

It can be done like this:

from keras.datasets import mnist

(X_train, y_train), (X_test, y_test) = mnist.load_data()
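
One thing to keep in mind when switching loaders: mnist.load_data() returns the images as uint8 arrays of shape (n, 28, 28), whereas code written around fetch_mldata usually expects one flattened row of 784 pixels per image. A minimal sketch of the conversion follows; flatten_images is a hypothetical helper name, not part of Keras, and the zero-filled array only stands in for X_train so that nothing has to be downloaded.

```python
import numpy as np

def flatten_images(images, scale=False):
    """Reshape (n, rows, cols) images into (n, rows*cols) rows.

    flatten_images is a hypothetical helper, not a Keras function.
    With scale=True the pixels are converted to float32 in [0, 1].
    """
    images = np.asarray(images)
    n_pixels = int(np.prod(images.shape[1:]))
    flat = images.reshape(images.shape[0], n_pixels)
    if scale:
        flat = flat.astype(np.float32) / 255.0
    return flat

# Stand-in for X_train so the sketch runs without a download.
fake_batch = np.zeros((3, 28, 28), dtype=np.uint8)
print(flatten_images(fake_batch).shape)  # (3, 784)
```

With the real data the call would be X_train_flat = flatten_images(X_train).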

If you are a beginner in machine learning and Python and want to learn more, I recommend taking a look at this excellent blog post.

Also, the file name you pass to the function must include the file's extension if it has one. For the compressed downloads that means calling the function like this (keep in mind that a .gz file is still compressed, so it must be decompressed first or read with gzip.open, because the plain open in loadmnist does not decompress it):

train_img, train_lbl = loadmnist('mnist/train-images-idx3-ubyte.gz'
                                 , 'mnist/train-labels-idx1-ubyte.gz')
test_img, test_lbl = loadmnist('mnist/t10k-images-idx3-ubyte.gz'
                               , 'mnist/t10k-labels-idx1-ubyte.gz')
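
The sketch below is one way to read either the compressed .gz downloads or the extracted files with the same function. It is not the original loadmnist: load_idx_pair and _open_maybe_gzip are made-up names, and it reads all pixels in one np.frombuffer call instead of byte by byte. It assumes the standard IDX layout (big-endian 32-bit header fields) that the question's code already relies on.

```python
import gzip
import struct
import numpy as np

def _open_maybe_gzip(path):
    """Open path in binary mode, decompressing it if it ends in .gz."""
    path = str(path)
    return gzip.open(path, "rb") if path.endswith(".gz") else open(path, "rb")

def load_idx_pair(imagefile, labelfile):
    """Return (x, y): x has shape (N, rows*cols), y has shape (N,).

    Hypothetical helper; the returned arrays are read-only views,
    so call .copy() on them if you need to modify the data.
    """
    with _open_maybe_gzip(imagefile) as images:
        # Header: magic number, image count, rows, cols (big-endian uint32).
        _, n_images, rows, cols = struct.unpack(">IIII", images.read(16))
        x = np.frombuffer(images.read(n_images * rows * cols), dtype=np.uint8)
        x = x.reshape(n_images, rows * cols)
    with _open_maybe_gzip(labelfile) as labels:
        # Header: magic number, label count.
        _, n_labels = struct.unpack(">II", labels.read(8))
        y = np.frombuffer(labels.read(n_labels), dtype=np.uint8)
    if n_images != n_labels:
        raise ValueError("image and label files disagree on the item count")
    return x, y
```

It would be called the same way as loadmnist, for example load_idx_pair('mnist/train-images-idx3-ubyte.gz', 'mnist/train-labels-idx1-ubyte.gz').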

The code you use to load data from the local disk throws an error because the file does not exist at the given location. Make sure that the folder mnist is in the same folder as your notebook.
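
A relative path such as 'data/train-images-idx3-ubyte' is resolved against the current working directory of the Python process, which may not be the folder you expect. The sketch below prints that directory and reports which files are missing before loadmnist is called. find_missing is a hypothetical helper, and the folder and file names are copied from the question.

```python
from pathlib import Path

def find_missing(folder, filenames):
    """Return the names in filenames that do not exist inside folder."""
    folder = Path(folder)
    return [name for name in filenames if not (folder / name).is_file()]

# File names as they appear in the question.
expected = [
    "train-images-idx3-ubyte",
    "train-labels-idx1-ubyte",
    "t10k-images-idx3-ubyte",
    "t10k-labels-idx1-ubyte",
]

print("Working directory:", Path.cwd())
print("Missing from 'data':", find_missing("data", expected))
```

An empty list means every file was found. If a name is reported as missing, compare it character by character with the file name on disk, since a hidden extension such as .gz is enough to trigger FileNotFoundError.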

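The server has been down for a while. See this GitHub thread for some solutions, including importing the data from TensorFlow or getting it directly from other sources.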
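
For context, fetch_mldata depended on mldata.org and was later removed from scikit-learn (in version 0.22); its replacement, fetch_openml, downloads MNIST from OpenML under the name mnist_784. A minimal sketch, assuming scikit-learn 0.22 or newer and a working network connection. load_mnist_openml and its fetcher argument are hypothetical and exist only so the function can be tried without a download.

```python
import numpy as np

def load_mnist_openml(fetcher=None, n_train=60000):
    """Fetch MNIST from OpenML and split it into train and test parts.

    fetcher is a hypothetical hook: any callable that accepts the same
    arguments as sklearn.datasets.fetch_openml can be passed in.
    """
    if fetcher is None:
        # Imported lazily so scikit-learn is only needed for a real fetch.
        from sklearn.datasets import fetch_openml
        fetcher = fetch_openml
    bunch = fetcher("mnist_784", version=1, as_frame=False)
    X = np.asarray(bunch.data)
    # OpenML stores the labels as strings ('0'..'9'); convert to integers.
    y = np.asarray(bunch.target).astype(np.uint8)
    # By convention the first 60,000 rows form the training set.
    return (X[:n_train], y[:n_train]), (X[n_train:], y[n_train:])
```

Calling load_mnist_openml() with no arguments performs the real download, which scikit-learn caches under ~/scikit_learn_data by default, so later calls do not need the network.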

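You can load a digits dataset directly from sklearn's bundled datasets. Be aware that load_digits returns scikit-learn's small built-in set of 1,797 images of 8x8 pixels, not the full 70,000-image, 28x28 MNIST.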

from sklearn import datasets
digits = datasets.load_digits()

Or you could load it using Keras.

from keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

Another option is to download the dataset and load it with something like pandas.

import pandas as pd

df = pd.read_csv('filename.csv')
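
To turn such a file into arrays, here is a sketch that assumes a CSV copy of MNIST in which the first column holds the label and the remaining columns hold the pixel values, with no header row. That is a common layout, but check your own file. load_mnist_csv is a hypothetical helper, and filename.csv is the placeholder name used above.

```python
import pandas as pd

def load_mnist_csv(path, has_header=False):
    """Return (X, y) from a CSV whose first column is the label.

    Hypothetical helper; assumes every value fits in an unsigned byte.
    """
    df = pd.read_csv(path, header=0 if has_header else None)
    y = df.iloc[:, 0].to_numpy(dtype="uint8")   # labels
    X = df.iloc[:, 1:].to_numpy(dtype="uint8")  # one row of pixels per image
    return X, y
```

Usage would be X, y = load_mnist_csv('filename.csv'); pass has_header=True if the first line of the file contains column names.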

I ran into this error while coding in Spyder (Python 3.7), installed locally with Anaconda. I tried many answers, and in the end I was only able to get around the error by downloading the MNIST dataset and specifying the location of the downloaded file.

from scipy.io import loadmat

mnist_path = r"C:\Users\duppa\Desktop\mnist-original.mat"
mnist_raw = loadmat(mnist_path)
mnist = {
    "data": mnist_raw["data"].T,
    "target": mnist_raw["label"][0],
    "COL_NAMES": ["label", "data"],
    "DESCR": "mldata.org dataset: mnist-original",
}
mnist
