
Using MNIST data with Keras

I am currently playing around with MNIST data as part of a course on using numpy and tensorflow. I was running the code provided in the course and noticed a few warnings from tensorflow when running this snippet of code:

from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("../data/mnist_data/", one_hot=True)

I looked into the documentation and read that this is deprecated and that one should use the MNIST dataset from keras instead. So I changed the above code to this:

from keras.datasets import mnist
from keras.models import Sequential, load_model
from keras.layers.core import Dense, Dropout, Activation
from keras.utils import np_utils

(X_train, y_train), (X_test, y_test) = mnist.load_data()

My issue now is that the course material uses this function:

training_digits, training_labels = mnist.train.next_batch(5000)

That function, next_batch(), isn't available with keras, and the original MNIST dataset is pretty large. Is there a clever way to do this with keras?

Many thanks in advance!

You can set the batch_size and use a one-shot iterator, as described here in the Keras MNIST documentation.
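A rough sketch of what that could look like under TF 1.x graph mode (the batch size of 5000 simply mirrors the next_batch(5000) call from the question):

import tensorflow as tf  # assuming TF 1.x, where the graph-mode iterator API exists
from keras.datasets import mnist

(x_train, y_train), _ = mnist.load_data()

# Wrap the in-memory numpy arrays in a Dataset and batch it
dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(5000)
iterator = dataset.make_one_shot_iterator()
next_element = iterator.get_next()

with tf.Session() as sess:
    training_digits, training_labels = sess.run(next_element)  # one batch of 5000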

Use Sequential() from Keras. Sequential() has a method called fit(), where you can set batch_size as a parameter. See the documentation: keras Sequential
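A minimal sketch (the model here is arbitrary, just enough to show batch_size in fit()):

from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Flatten, Dense

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = Sequential([
    Flatten(input_shape=(28, 28)),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# batch_size tells fit() how many samples to use per gradient update,
# which replaces the manual next_batch() calls from the old tutorial
model.fit(x_train, y_train, batch_size=128, epochs=5)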

The issue is that your tutorial uses a different API from the keras.datasets API found in most current tutorials. By switching to the keras.datasets API you are trying to 'cross the streams'.

You (broadly) have three options:

Option 1

Just stick with your existing tutorial and ignore the deprecation warnings. This is super straightforward, but you may miss out on the benefits of the keras API (the new default) unless you intend to learn it later.

Option 2

Switch entirely to the keras API and find a new tutorial. This one is an MNIST example in just a few lines of code:

import tensorflow as tf

mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])


model.fit(x_train, y_train, epochs=5)

model.evaluate(x_test, y_test)

If it's available to you, this is the option I'd recommend. keras is the new default. Perhaps this isn't an option, or you want to stick with your original course, but I'd certainly recommend becoming familiar with keras soon.

Option 3

Find a way to successfully 'cross the streams'.

This is trickier but can certainly be done. The keras dataset for MNIST is just a pair of big numpy arrays, after all. You could look into the tf.data.Dataset API (in particular from_tensors() and from_tensor_slices()). These options would need a little bit of wrangling, though, because inherently (as you discovered) the dataset returned by the new method is a different type from the one returned by the old API.
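As a rough sketch of that wrangling (assuming TF 2.x with eager execution; the 5000 simply mirrors next_batch(5000) from the question, and note the old helper also one-hot encoded the labels, which you'd have to do yourself, e.g. with tf.one_hot):

import tensorflow as tf

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()

# Flatten and rescale so each image looks like the old tutorial's 784-wide float vector
x_train = x_train.reshape(-1, 784).astype('float32') / 255.0

# Build a shuffled, batched Dataset from the in-memory arrays
dataset = (tf.data.Dataset.from_tensor_slices((x_train, y_train))
           .shuffle(buffer_size=60000)
           .batch(5000))

# Each element now plays the role of mnist.train.next_batch(5000)
for training_digits, training_labels in dataset:
    print(training_digits.shape, training_labels.shape)  # (5000, 784) (5000,)
    break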

UPDATE:

The link in nag's answer provides a comprehensive example of doing this, which I was unaware of previously!
