In Keras, which is better: using validation_split (in the "fit" method) or the model.evaluate function?

Original 2021-02-08 06:19:06 7 1 python/ tensorflow/ keras

In Keras there are (at least) two ways to split the data and display loss/accuracy:

1) The Keras fit function has a validation_split option that splits the dataset into training and validation sets and displays loss/accuracy values during training.
2) Another way is to split the data at the beginning of the code (using the train_test_split function, for instance), train the model on the training set, then use model.evaluate on the test set.
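For readers without access to the linked code, the two approaches can be sketched roughly as follows. This is a minimal illustration, not the code from the question: the synthetic data, the `build_model` helper, the layer sizes, and the epoch count are all assumptions made for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow import keras

# Synthetic stand-in data (assumption): 400 samples, 8 features, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 8)).astype("float32")
y = (X[:, 0] + X[:, 1] > 0).astype("float32")


def build_model():
    # Hypothetical small classifier, not the model from the question.
    model = keras.Sequential([
        keras.Input(shape=(8,)),
        keras.layers.Dense(16, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


# Method 1: fit() holds out the last 30% of the arrays it is given.
model_1 = build_model()
history = model_1.fit(X, y, validation_split=0.3, epochs=5, verbose=0)
val_acc_method_1 = history.history["val_accuracy"][-1]

# Method 2: split first, train on the training part, evaluate on the test part.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)
model_2 = build_model()
model_2.fit(X_train, y_train, epochs=5, verbose=0)
test_loss, test_acc_method_2 = model_2.evaluate(X_test, y_test, verbose=0)

print(f"method 1 val_accuracy:  {val_acc_method_1:.3f}")
print(f"method 2 test accuracy: {test_acc_method_2:.3f}")
```

Note that the two held-out sets are different samples: method 1 takes the tail of the arrays in their given order, while train_test_split shuffles before splitting by default.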

I notice that the accuracy with the second method is generally lower (maybe it is more realistic?), and I am wondering which method should be preferred over the other. I would think that method 2), being less optimistic, would be preferable.

You can play with my code here. To use the 1st method, uncomment the lines marked "HERE1" (and comment out the lines marked "HERE2"); to use the 2nd method, do the opposite.

Any suggestions?

Best regards

Aymeric

1 answer

The best way to see your validation results would be to split your data so that the training and validation sets contain the same proportion of each class. This can be done using StratifiedKFold from sklearn.model_selection.
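As a rough sketch of what that looks like, the snippet below runs StratifiedKFold on synthetic labels (an assumption; the question's dataset is not reproduced here) and records the positive-class rate in each validation fold. The fold count and random seed are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in data (assumption): 200 samples, 70% positive class.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = np.array([1] * 140 + [0] * 60)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

fold_pos_rates = []
for fold, (train_idx, val_idx) in enumerate(skf.split(X, y)):
    X_train, y_train = X[train_idx], y[train_idx]
    X_val, y_val = X[val_idx], y[val_idx]
    # A model would be trained on (X_train, y_train) and scored on (X_val, y_val) here.
    fold_pos_rates.append(float(y_val.mean()))
    print(f"fold {fold}: {len(val_idx)} validation samples, "
          f"positive rate {fold_pos_rates[-1]:.2f}")
```

Each validation fold keeps roughly the same class ratio as the full label array, which is the property the answer is asking for.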

When I looked through the docs for tf.keras.Model, I found that "The validation data is selected from the last samples in the x and y data provided, before shuffling". Looking further into the last 30% of your training data, I found that 156 of the 186 samples belonged to the positive class, leaving just 30 samples in the negative class. In the data that you split afterwards, only 115 of the 268 samples were positive. Note that the totals differ because model.fit() splits only the training dataset, whereas the other split was applied to the entire dataset before training. This large imbalance could be a contributor to the large variation in accuracy.
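A check like this can be done before training with plain NumPy. The sketch below uses made-up labels (not the question's data) and a hypothetical helper, `tail_positive_rate`, that looks at the slice validation_split would hold out according to the documented "last samples, before shuffling" behaviour. The exact rounding of the split index is an assumption.

```python
import numpy as np


def tail_positive_rate(y, validation_split):
    """Positive-class rate in the last `validation_split` fraction of y.

    Hypothetical helper: it mimics the documented behaviour of holding out
    the last samples, and assumes the split index is rounded down.
    """
    split_at = int(len(y) * (1.0 - validation_split))
    return float(np.mean(y[split_at:]))


# Made-up labels (assumption): balanced at the start, mostly positive at the end.
y = np.array([0, 1] * 35 + [1] * 25 + [0] * 5)

overall_rate = float(np.mean(y))
tail_rate = tail_positive_rate(y, validation_split=0.3)

print(f"positive rate overall:     {overall_rate:.2f}")
print(f"positive rate in last 30%: {tail_rate:.2f}")
```

A large gap between the two rates is a sign that the validation slice is not representative of the data the model is trained on.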

Because of this, I would recommend almost never using the validation_split parameter: it is easier to organize your data beforehand than to check the last x% of your training data during training.
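One way to organize the data beforehand, sketched here under assumptions (synthetic, class-sorted labels and an arbitrary 30% hold-out), is a stratified train_test_split. The `stratify` argument keeps the class ratio the same in both parts even when the input is ordered by class.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): labels sorted by class on purpose,
# so taking the last 30% without shuffling would yield only negatives.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = np.array([1] * 210 + [0] * 90)

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

print(f"train positive rate:      {y_train.mean():.2f}")
print(f"validation positive rate: {y_val.mean():.2f}")
```

The resulting arrays can then be passed to fit explicitly, for example `model.fit(X_train, y_train, validation_data=(X_val, y_val))`, instead of relying on validation_split.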




Solution 1 · 0 · ACCEPTED · 2021-02-08 06:53:13
