Keras fit_generator doesn't see all of the data

I am trying to train a convolutional neural network on tables of 0s and 1s. Each input "image" is 7*6 and the network uses 15 filters. Since the dataset is rather large, I am training with fit_generator. Here is the call:

model.fit_generator(
    generator=data_generator(ml_mode='train'),
    samples_per_epoch=12800,  # int(self.num_points['train']['pos'] * 2 * self.configs['training']['train_data_fraction_per_epoch'])
    nb_epoch=100,
    callbacks=None,
    verbose=1,
    validation_data=data_generator(ml_mode='test'),
    nb_val_samples=1280,
    initial_epoch=0,
    class_weight=my_class_weight
)

data_generator is my own Python generator: it takes ml_mode ('train' or 'test') and yields batches of inputs and labels.
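A generator of the kind fit_generator expects could look roughly like the sketch below. It is a hypothetical stand-in, not the data_generator from this question: the name batch_generator, the comma-separated line format ("label,c0,...,c41") and the 7-rows-by-6-columns layout are all assumptions made for illustration. The one property that matters is that it loops forever, since Keras keeps calling it across epochs.

```python
def batch_generator(lines, batch_size):
    """Yield (inputs, labels) batches forever, restarting at the end of `lines`.

    Assumed (hypothetical) line format: "label,c0,c1,...,c41", i.e. one label
    followed by the 42 cells of a 7x6 table of 0s and 1s.
    `lines` must be a re-iterable sequence such as a list, not a one-shot iterator.
    """
    if len(lines) < batch_size:
        raise ValueError("need at least one full batch of lines")
    while True:  # loop indefinitely; Keras decides when an epoch ends
        inputs, labels = [], []
        for line in lines:
            fields = line.strip().split(",")
            labels.append(int(fields[0]))
            cells = [int(v) for v in fields[1:]]
            # Reshape the flat 42 cells into 7 rows of 6 columns.
            inputs.append([cells[r * 6:(r + 1) * 6] for r in range(7)])
            if len(inputs) == batch_size:
                yield inputs, labels
                inputs, labels = [], []
        # A trailing partial batch is dropped so every batch has the same size.


# Three made-up rows, just to exercise the generator.
rows = [
    "1," + ",".join(["0"] * 42),
    "0," + ",".join(["1"] * 42),
    "1," + ",".join(["0", "1"] * 21),
]
gen = batch_generator(rows, batch_size=2)
x, y = next(gen)
```

Real code would convert each batch to NumPy arrays before yielding; nested lists are used here only to keep the sketch dependency-free.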

I added print statements to count how many lines the generator reads, and it only reaches about 25k, while my data has 130k lines. Can anybody help me understand what the problem might be, or what I am missing?

Thanks

I have solved it myself and I think it might be useful to share what I have observed.

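samples_per_epoch is the number of samples Keras pulls from the generator, resuming it after each yield, before it ends the epoch; it is not tied to the size of the data. I fixed it by setting samples_per_epoch to my batch_size and nb_epoch to data_size / batch_size, so that the epochs taken together cover the whole dataset.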
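The arithmetic behind these two parameters can be checked with a small model of the consumption loop. This is a simplified sketch, not Keras code: it only assumes that, in each epoch, batches are drawn until at least samples_per_epoch samples have been seen. The batch size of 128 is an assumption (the post does not state it), and 130000 stands in for "130k lines".

```python
def count_samples_drawn(samples_per_epoch, nb_epoch, batch_size):
    """Total samples pulled from an endless generator over a whole run.

    Simplified model: within each epoch, keep drawing batches until at
    least samples_per_epoch samples have been seen.
    """
    def endless_batches():
        while True:
            yield batch_size  # only the batch length matters for counting

    gen = endless_batches()
    total = 0
    for _ in range(nb_epoch):
        seen = 0
        while seen < samples_per_epoch:
            seen += next(gen)
        total += seen
    return total


batch_size = 128      # assumed, not given in the post
data_size = 130000    # "130k lines"

# Settings from the question: 100 epochs of 12800 samples each.
original = count_samples_drawn(12800, 100, batch_size)   # 1280000

# Settings from the answer: one batch per epoch, data_size // batch_size epochs.
fixed = count_samples_drawn(batch_size, data_size // batch_size, batch_size)  # 129920
```

Under these assumptions the answer's settings draw roughly one pass over the data (the last 80 lines do not fill a batch), with each "epoch" being a single batch.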

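Keras also reads 10 examples up front, before training starts, and only then seems to pick the best available core, which confused me a little at first. (The read-ahead is probably the generator queue being filled in the background; its max_q_size argument defaults to 10.)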
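The "reads 10 first" behaviour is what a bounded prefetch queue would produce. The sketch below is a toy model, not the Keras implementation: a background thread pulls from a generator only while the queue holds fewer than max_q_size items, and nothing is consumed. The function name prefetch_count and the timing values are assumptions chosen for the demonstration.

```python
import queue
import threading
import time


def prefetch_count(max_q_size=10, timeout=2.0):
    """Return how many items a background producer reads when nothing is consumed."""
    reads = 0
    q = queue.Queue()
    stop = threading.Event()

    def source():
        nonlocal reads
        while True:
            reads += 1  # count every item the producer pulls from the generator
            yield reads

    def producer():
        gen = source()
        while not stop.is_set():
            if q.qsize() < max_q_size:
                q.put(next(gen))
            else:
                time.sleep(0.01)  # queue is full: wait for a consumer

    worker = threading.Thread(target=producer, daemon=True)
    worker.start()

    # Wait until the queue is full (or give up after `timeout` seconds).
    deadline = time.time() + timeout
    while q.qsize() < max_q_size and time.time() < deadline:
        time.sleep(0.01)
    time.sleep(0.05)  # leave time for any further reads, if there were going to be any

    stop.set()
    worker.join()
    return reads


prefetched = prefetch_count()  # 10 with the default queue size
```

With this model, prints placed inside the generator would show 10 reads before the first training step consumes anything, which matches the observation above.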

Best
