How to load my own data or an online dataset in Python for training a CNN or autoencoder?

I'm having trouble with a simple problem while loading a dataset in Python. I want to define a function called load_dataset() to use when training an autoencoder. My code is:

import matplotlib
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from urllib import urlretrieve
import cPickle as pickle
import os

import gzip
import matplotlib.cm as cm
import theano
import lasagne
from lasagne import layers
from lasagne.updates import nesterov_momentum
from nolearn.lasagne import NeuralNet
from nolearn.lasagne import visualize
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
# I tried to load the data from an online source
def load_dataset():
    url = 'ftp://ftp nrg.wustl.edu/data/oasis_cross-sectional_disc2.tar.gz'
    filename ='oasis_cross-sectional_disc2.tar.gz'
    if not os.path.exists(filename):
        print("Downloading MNIST dataset...")
        urlretrieve(url, filename)
    with gzip.open(filename, 'rb') as f:
        data = pickle.load(f)
    X_train, y_train = data[0]
    X_val, y_val = data[1]
    X_test, y_test = data[2]
    X_train = X_train.reshape((-1, 1, 28, 28))
    X_val = X_val.reshape((-1, 1, 28, 28))
    X_test = X_test.reshape((-1, 1, 28, 28))
    y_train = y_train.astype(np.uint8)
    y_val = y_val.astype(np.uint8)
    y_test = y_test.astype(np.uint8)
    return X_train, y_train, X_val, y_val, X_test, y_test
X_train, y_train, X_val, y_val, X_test, y_test = load_dataset()

Downloading MNIST dataset...

Traceback (most recent call last):
  File "<pyshell#46>", line 1, in <module>
    X_train, y_train, X_val, y_val, X_test, y_test = load_dataset()
  File "<pyshell#45>", line 6, in load_dataset
    urlretrieve(url, filename)
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 98, in urlretrieve
    return opener.retrieve(url, filename, reporthook, data)
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 245, in retrieve
    fp = self.open(url, data)
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 213, in open
    return getattr(self, name)(url)
  File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 526, in open_ftp
    host = socket.gethostbyname(host)
IOError: [Errno socket error] [Errno 8] nodename nor servname provided, or not known

This error appeared.

I also tried to load data from my desktop using this code:

for path, dirs, files in os.walk(pat):
    for filename in files:
        fullpath = os.path.join(path, filename)
        with open(fullpath, 'r') as f:
            s = np.load(f)
            data = f.read()
            print data

But I failed to load the data as values for X_train, y_train, X_val, y_val, X_test, y_test. I don't know whether I should compress the dataset as .pkl.gz or use a different function for loading the data. Could you help me?

If you can use Keras to build the network, here is a way to load the MNIST dataset:

import keras
from keras.datasets import mnist
from keras.layers import Dense, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.models import Sequential

# Load the MNIST dataset, which is already split into training and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()
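
The question asks for six arrays (X_train, y_train, X_val, y_val, X_test, y_test), while MNIST as loaded by Keras comes with only a training and a test split. A minimal sketch of one way to get a validation set is below, written for Python 3 with NumPy. The helper name prepare_splits, the 10% validation fraction, the scaling to [0, 1], and the channels-last shape (N, H, W, 1) are all assumptions made for illustration, not part of the original answer. The code in the question used a channels-first shape (N, 1, 28, 28) for Lasagne, so adjust the reshape if your framework expects that.

```python
import numpy as np


def prepare_splits(x_train, y_train, x_test, y_test, val_fraction=0.1, seed=0):
    """Hypothetical helper: hold out a validation set and prepare image arrays.

    Expects image arrays of shape (N, H, W) with uint8 pixels and 1-D label arrays.
    """
    # Shuffle the training indices so the validation set is a random sample.
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x_train))
    n_val = int(len(x_train) * val_fraction)
    val_idx, train_idx = order[:n_val], order[n_val:]

    def to_float(x):
        # uint8 pixels in [0, 255] -> float32 in [0, 1], plus a trailing channel axis.
        return (x.astype(np.float32) / 255.0)[..., np.newaxis]

    X_train = to_float(x_train[train_idx])
    X_val = to_float(x_train[val_idx])
    X_test = to_float(x_test)

    # Labels are kept as small integers, as in the question's code.
    y_tr = y_train[train_idx].astype(np.uint8)
    y_val = y_train[val_idx].astype(np.uint8)
    y_te = y_test.astype(np.uint8)
    return X_train, y_tr, X_val, y_val, X_test, y_te
```

With the arrays returned by mnist.load_data(), the call would look like X_train, y_train, X_val, y_val, X_test, y_test = prepare_splits(x_train, y_train, x_test, y_test). For an autoencoder the labels are not needed as targets; the images themselves serve as both input and target.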

If you get an error while downloading the dataset, download it manually from https://s3.amazonaws.com/img-datasets/mnist.npz and put it in the ~/.keras/datasets folder, which is where Keras looks for cached datasets.
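
If you would rather read the downloaded file directly, without going through Keras, a sketch like the following should work in Python 3. It assumes the .npz archive stores its arrays under the names x_train, y_train, x_test and y_test, which is how the Keras copy of MNIST is laid out as far as I know. The function name load_mnist_npz is made up for this example.

```python
import os

import numpy as np


def load_mnist_npz(path):
    """Hypothetical helper: read train/test arrays from a local .npz archive.

    Assumes the archive contains arrays named x_train, y_train, x_test, y_test.
    """
    # np.load on an .npz file returns a lazy archive; the arrays are read
    # when indexed, so they are fetched before the file is closed.
    with np.load(path) as data:
        train = (data["x_train"], data["y_train"])
        test = (data["x_test"], data["y_test"])
    return train, test


if __name__ == "__main__":
    # Assumed location of the manually downloaded file; adjust as needed.
    npz_path = os.path.expanduser("~/.keras/datasets/mnist.npz")
    if os.path.exists(npz_path):
        (x_train, y_train), (x_test, y_test) = load_mnist_npz(npz_path)
        print(x_train.shape, y_train.shape, x_test.shape, y_test.shape)
```

The same pattern applies to your own data: save the arrays once with np.savez under names of your choice, then load them by those names. That avoids pickle, and it avoids opening a binary file in text mode as the os.walk attempt in the question does.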
