
Score remains the same during hyperparameter tuning

My model:

    from keras.models import Sequential
    from keras.layers import Dense

    model = Sequential()
    model.add(Dense(128, activation='relu', input_dim=n_input_1))
    model.add(Dense(64, activation='relu'))
    # model.add(Dense(32, activation='relu'))
    # model.add(Dense(16, activation='relu'))
    model.add(Dense(1))
    model.compile(optimizer='adam', loss='mse', metrics=['mse'])

Now I am doing hyperparameter tuning, but the score comes out the same for every parameter combination:

Best: -61101.514139 using {'batch_size': 10, 'epochs': 2}
-61101.514139 (25108.783936) with: {'batch_size': 10, 'epochs': 2}
-61101.514139 (25108.783936) with: {'batch_size': 10, 'epochs': 4}
-61101.514139 (25108.783936) with: {'batch_size': 10, 'epochs': 5}
-61101.514139 (25108.783936) with: {'batch_size': 10, 'epochs': 10}
-61101.514139 (25108.783936) with: {'batch_size': 10, 'epochs': 15}
-61101.514139 (25108.783936) with: {'batch_size': 20, 'epochs': 2}
-61101.514139 (25108.783936) with: {'batch_size': 20, 'epochs': 4}
-61101.514139 (25108.783936) with: {'batch_size': 20, 'epochs': 5}
-61101.514139 (25108.783936) with: {'batch_size': 20, 'epochs': 10}
-61101.514139 (25108.783936) with: {'batch_size': 20, 'epochs': 15}
-61101.514139 (25108.783936) with: {'batch_size': 30, 'epochs': 2}
-61101.514139 (25108.783936) with: {'batch_size': 30, 'epochs': 4}
-61101.514139 (25108.783936) with: {'batch_size': 30, 'epochs': 5}
-61101.514139 (25108.783936) with: {'batch_size': 30, 'epochs': 10}
-61101.514139 (25108.783936) with: {'batch_size': 30, 'epochs': 15}

This is the first time I am doing hyperparameter tuning, and this has stumped me. I can provide additional details if needed. What could be the reason for this behavior?

I am doing time series forecasting using an MLP. I have used 'neg_mean_absolute_error' as the scoring function in GridSearchCV.

Edit: this is what I'm running:

    from sklearn.model_selection import GridSearchCV
    from keras.wrappers.scikit_learn import KerasClassifier
    import numpy as np

    # fix random seed for reproducibility
    seed = 7
    np.random.seed(seed)

    # define the grid search parameters
    model = KerasClassifier(build_fn=create_model, verbose=1)
    batch_size = [10, 20, 2000]
    epochs = [2, 4, 5, 10, 25]
    param_grid = dict(batch_size=batch_size, epochs=epochs)
    grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3, scoring='neg_mean_squared_error')
    grid_result = grid.fit(scaled_train, scaled_train_y)

    # summarize results
    print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
    means = grid_result.cv_results_['mean_test_score']
    stds = grid_result.cv_results_['std_test_score']
    params = grid_result.cv_results_['params']
    for mean, stdev, param in zip(means, stds, params):
        print("%f (%f) with: %r" % (mean, stdev, param))

It seems like you are not giving those variables enough range to show a difference. A neural network has many hyperparameters to tune, so I will give a brief explanation of some of them and what they do.

1) Batch size: say we have one million examples for our model to learn from and we want the model to see the whole dataset without throwing any examples away. The weights are updated once after every batch-size samples, so increasing the batch size means fewer updates per pass over the data (lower update efficiency) but a more diverse, less noisy sample in each update.

2) Epoch: when the model has trained on all the given data once, we count that as one epoch. Within each epoch the weights are updated once per batch, so increasing the batch size reduces the number of updates per epoch.

3) Learning rate: this controls how much the weights change per update. Too high, and the loss bounces around or shoots up; too low, and the loss decreases very slowly.
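The batch-size/epoch relationship in points 1 and 2 can be sketched with a bit of arithmetic (plain Python, no Keras required; the sample counts are made up for illustration):

```python
import math

def updates_per_epoch(n_samples, batch_size):
    """One weight update per batch, with a final partial batch if
    the dataset size is not a multiple of the batch size."""
    return math.ceil(n_samples / batch_size)

# With 1,000 training samples:
print(updates_per_epoch(1000, 10))    # 100 updates per epoch
print(updates_per_epoch(1000, 2000))  # 1 update per epoch (batch larger than dataset)
```

Note that a batch size of 2000, as in the grid above, collapses a small dataset to a single update per epoch, so very few weight updates happen in total.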

So what you are doing is varying the epoch count and batch size, and you probably see no reduction in loss because, normally, people train the model for a few hundred or a few thousand epochs before the differences become visible. I would advise you to vary the learning rate, fix every other parameter, and run for 100 epochs; then you will see a difference. Moreover, you do not have to vary the epoch count in the grid, because you can train once, record the loss at every epoch, and compare it with the other experiments.
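As a toy illustration of why the learning rate matters so much, here is gradient descent on f(w) = w² rather than the asker's Keras model (the function name and values are my own, purely for demonstration):

```python
def gd_final_loss(lr, steps=100):
    """Minimize f(w) = w**2 by gradient descent from w = 1.0.
    The gradient is 2*w, so each step does w -= lr * 2 * w."""
    w = 1.0
    for _ in range(steps):
        w -= lr * 2 * w
    return w ** 2  # final loss after all steps

print(gd_final_loss(0.1))    # near zero: converged
print(gd_final_loss(1.5))    # huge: the loss shot up (diverged)
print(gd_final_loss(0.005))  # still well above zero: converging very slowly
```

The same three regimes (converge, diverge, crawl) appear when tuning a real network's learning rate, which is why it usually dominates batch size and epoch count as the parameter to search first.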

Here is the link if you want to learn more about what each parameter does.
