
Criteria for nb_epoch, samples_per_epoch, and nb_val_samples in keras fit_generator?

I have created a simple cat and dog image classifier (a convolutional neural network). I have 7,000 training images per class and 5,500 validation images per class.

My problem is that my system does not complete all the epochs. I would really appreciate it if someone could explain the criteria for choosing the nb_epoch, samples_per_epoch and nb_val_samples values so as to get the most out of a given amount of training and validation data.

Following is my code:

from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras.callbacks import EarlyStopping
import numpy as np
from keras.preprocessing import image
from keras.utils.np_utils import probas_to_classes

model=Sequential()
model.add(Convolution2D(32, 5,5, input_shape=(28,28,3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Convolution2D(32,3,3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))

model.add(Dense(2))
model.add(Activation('softmax'))

train_datagen=ImageDataGenerator(rescale=1./255,
                                 shear_range=0.2,
                                 zoom_range=0.2,
                                 horizontal_flip=True)
test_datagen=ImageDataGenerator(rescale=1./255)

train_generator=train_datagen.flow_from_directory(
    r'F:\data\train',
    target_size=(28,28),
    classes=['dog','cat'],
    batch_size=10,
    class_mode='categorical',
    shuffle=True)

validation_generator=test_datagen.flow_from_directory(
    r'F:\data\validation',
    target_size=(28, 28),
    classes=['dog','cat'],
    batch_size=10,
    class_mode='categorical',
    shuffle=True)

model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
early_stopping=EarlyStopping(monitor='val_loss', patience=2)
model.fit_generator(train_generator,
                    verbose=2,
                    samples_per_epoch=650,
                    nb_epoch=100,
                    validation_data=validation_generator,
                    callbacks=[early_stopping],
                    nb_val_samples=550)

json_string=model.to_json()
open(r'F:\data\mnistcnn_arc.json','w').write(json_string)
model.save_weights(r'F:\data\mnistcnn_weights.h5')
score=model.evaluate_generator(validation_generator, 1000)

print('Test score:', score[0])
print('Test accuracy:', score[1])

img_path = 'F:/abc.jpg'
img = image.load_img(img_path, target_size=(28, 28))
x = image.img_to_array(img)
x = x / 255.0   # apply the same 1./255 rescaling the data generators use
x = np.expand_dims(x, axis=0)

y_proba = model.predict(x)
y_classes = probas_to_classes(y_proba)
print(train_generator.class_indices)
print(y_classes)

samples_per_epoch is usually set as:

samples_per_epoch=train_generator.nb_sample

(nb_sample is the sample count exposed by the Keras 1.x directory iterator; in Keras 2 the attribute is called samples.) This ensures that in every epoch the model sees a number of samples equal to the size of your training set, i.e. roughly every training sample once per epoch.
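As a quick check of the sizes your generators actually found (a minimal sketch, assuming the same Keras 1.x nb_sample attribute):

train_size = train_generator.nb_sample        # 14000 for 7000 dogs + 7000 cats
val_size = validation_generator.nb_sample     # 11000 for 5500 dogs + 5500 cats
print('training samples:', train_size, '| validation samples:', val_size)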


nb_epoch is pretty much up to you. It determines how many times you iterate over the number of samples defined by samples_per_epoch.

To give you an example, with your current call the model 'sees' nb_epoch * samples_per_epoch images over the whole run, which in this case is 100 * 650 = 65,000 images.
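A quick sanity check of that arithmetic, using the values from your call:

nb_epoch = 100
samples_per_epoch = 650
print(nb_epoch * samples_per_epoch)   # 65000 images processed across all epochs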


nb_val_samples determines how many validation samples your model is evaluated on after each epoch. It is up to you as well. The usual choice is:

nb_val_samples=validation_generator.nb_sample

so that the model is evaluated on the full validation set after every epoch.
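Putting it together, here is a sketch of what your fit_generator call could look like if you want each epoch to cover the whole training set and each validation pass to cover the whole validation set (Keras 1.x keyword names and the nb_sample attribute assumed, as above):

model.fit_generator(train_generator,
                    samples_per_epoch=train_generator.nb_sample,     # 14000: full training set per epoch
                    nb_epoch=100,                                    # upper bound; EarlyStopping may stop sooner
                    validation_data=validation_generator,
                    nb_val_samples=validation_generator.nb_sample,   # 11000: full validation set per evaluation
                    callbacks=[early_stopping],
                    verbose=2)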


batch_size determines how many images are fed to your GPU (or CPU) at the same time. The rule of thumb is to use the largest batch_size that your GPU's memory allows. The ideal batch_size is an active area of research nowadays, but usually a bigger batch_size will work better.
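For example, you could simply raise the batch_size in your flow_from_directory calls and keep everything else the same (32 below is only an illustrative value, not a recommendation from your setup):

train_generator=train_datagen.flow_from_directory(
    r'F:\data\train',
    target_size=(28,28),
    classes=['dog','cat'],
    batch_size=32,               # illustrative; try the largest value your GPU memory allows
    class_mode='categorical',
    shuffle=True)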
