I have a dataset that is too large to fit in RAM, so I opted to use train_on_batch() to train my model incrementally. To check whether this approach works, I took a subset of the data for some preliminary testing.
However, I have been having trouble training the model: the accuracy gets stuck at 10% when training with train_on_batch(), whereas with fit() I get an accuracy of 95% after 40 epochs. I have also tried fit_generator() and ran into similar issues.
Using fit():
results = model.fit(x_train,y_train,batch_size=128,nb_epoch=40)
Using train_on_batch():
# 386 has been chosen so that each batch size is 128
splitSize = len(y_train) // 386
for j in range(20):
    print('epoch: ' + str(j) + ' ----------------------------')
    np.random.shuffle(x_train)
    np.random.shuffle(y_train)
    xb = np.array_split(x_train, 386)
    yb = np.array_split(y_train, 386)
    sumAcc = 0
    index = list(range(386))
    random.shuffle(index)
    for i in index:
        results = model.train_on_batch(xb[i], yb[i])
        sumAcc += results[1]
    print(sumAcc / 386)
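The shuffle you are using is incorrect: after it, y_train no longer matches x_train. Calling np.random.shuffle() separately on each array shuffles each one in a different random order, so the samples end up paired with the wrong labels. Instead, shuffle a single index array and apply it to both: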
length = x_train.shape[0]
idxs = np.arange(0, length)
np.random.shuffle(idxs)
x_train = x_train[idxs]
y_train = y_train[idxs]
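As a rough sketch of how the same idea could feed the batch loop, the helper below draws one permutation per epoch and splits it into batches, so every x batch stays aligned with its y batch. The name shuffled_batches, its parameters, and the toy arrays are hypothetical and only for illustration; they are not part of Keras or of the original code.

```python
import numpy as np

def shuffled_batches(x, y, n_batches, rng=None):
    """Yield (x_batch, y_batch) pairs whose rows stay aligned after shuffling."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of samples")
    rng = np.random.default_rng() if rng is None else rng
    # One permutation shared by both arrays keeps samples and labels paired.
    idxs = rng.permutation(len(x))
    for batch_idxs in np.array_split(idxs, n_batches):
        yield x[batch_idxs], y[batch_idxs]

# Toy data (an assumption for the demo): label i is always 2 * sample i,
# which makes it easy to check that the pairing survives the shuffle.
x_demo = np.arange(10).reshape(10, 1)
y_demo = 2 * np.arange(10)
batches = list(
    shuffled_batches(x_demo, y_demo, n_batches=3, rng=np.random.default_rng(0))
)
```

In the training loop from the question, each epoch would then presumably iterate over shuffled_batches(x_train, y_train, 386) and call model.train_on_batch(xb, yb) on every pair, which also makes the separate shuffle of the batch indices unnecessary.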