
Keras fit with validation_split gives higher validation accuracy than with validation_data

I am using the following fit function:

history = model.fit(x=[X1_train, X2_train, X3_train],
                    y=y_train,
                    batch_size=50,
                    epochs=20,
                    verbose=2,
                    validation_split=0.3,
                    #validation_data=([X1_test, X2_test, X3_test], y_test),
                    class_weight={0: 1, 1: 10})

and getting an average val_acc of 0.7. But when I run it again with the validation_data option instead (using data from the same dataset that I kept aside, around 30% of the training data in size), I get an average val_acc of 0.35. What could cause such a difference?

As requested by the OP, I am posting my comment as an answer and will try to elaborate a bit more:

When you set the validation_split argument, the validation samples are selected from the last samples of the training data and labels (i.e. X_train and y_train), without shuffling. Now, in this specific case, if the proportion of class labels in these selected samples is not the same as the proportion of class labels in the data you provide via the validation_data argument, then you should not necessarily expect the validation metrics to be the same in the two cases. That's simply because your model may have a different accuracy on each of the classes.
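A quick way to check this hypothesis is to compare the class balance of the slice that validation_split would carve off with the class balance of the separately held-out set. The sketch below is not from the original post; it assumes y_train and y_test are the binary label arrays from the question:

import numpy as np

val_fraction = 0.3

# validation_split=0.3 uses the *last* 30% of the training arrays, unshuffled
split_index = int(len(y_train) * (1 - val_fraction))
y_val_from_split = y_train[split_index:]

# For 0/1 labels, the mean is the fraction of positive samples
print("positive rate in validation_split slice:", np.mean(y_val_from_split))
print("positive rate in held-out validation_data:", np.mean(y_test))

If the two rates differ noticeably, shuffling the training arrays before calling fit, or creating the held-out set with a stratified split (e.g. sklearn's train_test_split with stratify=y), should bring the two validation scores much closer together.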
