How to carry out hyperparameter tuning for a Multi-layer Perceptron?

Question

I am a beginner in Python. I am trying to tune my MLP model below. I have the following questions:

1) Is this a correct methodology for tuning my MLP?

2) After running the code, I get a long warning (in pink) before the best parameters are printed. What is this warning? (I've provided the output of my model below.)

3) 'hidden_layer_sizes': [(100,), (50,100,), (50,75,100,)]: I am not sure about the number of hidden layers or the number of neurons in this line of code. Do I need to change these numbers? I have 9 inputs and 19 outputs. If I do need to change them, what are the best numbers of hidden layers and neurons to use?

Thank you in advance!

from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import GridSearchCV, train_test_split
X, y = scaled_df[[ "Part's Z-Height (mm)","Part's Weight (N)","Part's Volume (cm^3)","Part's Surface Area (cm^2)","Part's Orientation (Support's height) (mm)","Part's Orientation (Support's volume) (cm^3)","Layer Height (mm)","Printing/Scanning Speed (mm/s)","Infill Density (%)"]], scaled_df [["Climate change (kg CO2 eq.)","Climate change, incl biogenic carbon (kg CO2 eq.)","Fine Particulate Matter Formation (kg PM2.5 eq.)","Fossil depletion (kg oil eq.)","Freshwater Consumption (m^3)","Freshwater ecotoxicity (kg 1,4-DB eq.)","Freshwater Eutrophication (kg P eq.)","Human toxicity, cancer (kg 1,4-DB eq.)","Human toxicity, non-cancer (kg 1,4-DB eq.)","Ionizing Radiation (Bq. C-60 eq. to air)","Land use (Annual crop eq. yr)","Marine ecotoxicity (kg 1,4-DB eq.)","Marine Eutrophication (kg N eq.)","Metal depletion (kg Cu eq.)","Photochemical Ozone Formation, Ecosystem (kg NOx eq.)","Photochemical Ozone Formation, Human Health (kg NOx eq.)","Stratospheric Ozone Depletion (kg CFC-11 eq.)","Terrestrial Acidification (kg SO2 eq.)","Terrestrial ecotoxicity (kg 1,4-DB eq.)"]]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

mlp = MLPRegressor()
param_grid = {'hidden_layer_sizes': [(100,), (50,100,), (50,75,100,)],
              'activation': ['tanh','relu','lbfgs'],
              'solver': ['sgd', 'adam'],
              'learning_rate': ['constant','adaptive','invscaling'],
              'alpha': [0.0001, 0.05],
              'max_iter': [10000000000],
              'early_stopping': [False],
              'warm_start': [False]}
GS = GridSearchCV(mlp, param_grid=param_grid, n_jobs=-1, cv=5, scoring='r2')

GS.fit(X_train, y_train)

print(GS.best_params_)

Model Output:

/Users/opt/anaconda3/lib/python3.9/site-packages/sklearn/model_selection/_validation.py:615: FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan. Details: 
Traceback (most recent call last):
  File "/Users/opt/anaconda3/lib/python3.9/site-packages/sklearn/model_selection/_validation.py", line 598, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/Users/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py", line 673, in fit
    return self._fit(X, y, incremental=False)
  File "/Users/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py", line 357, in _fit
    self._validate_hyperparameters()
  File "/Users/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py", line 448, in _validate_hyperparameters
    raise ValueError("The activation '%s' is not supported. Supported "
ValueError: The activation 'lbfgs' is not supported. Supported activations are ['identity', 'logistic', 'relu', 'softmax', 'tanh'].

  
/Users/opt/anaconda3/lib/python3.9/site-packages/sklearn/model_selection/_validation.py:615: FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan. Details: 
Traceback (most recent call last):
  File "/Users/opt/anaconda3/lib/python3.9/site-packages/sklearn/model_selection/_validation.py", line 598, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/Users/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py", line 673, in fit
    return self._fit(X, y, incremental=False)
  File "/Users/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py", line 357, in _fit
    self._validate_hyperparameters()
  File "/Users/opt/anaconda3/lib/python3.9/site-packages/sklearn/neural_network/_multilayer_perceptron.py", line 448, in _validate_hyperparameters
    raise ValueError("The activation '%s' is not supported. Supported "
ValueError: The activation 'lbfgs' is not supported. Supported activations are ['identity', 'logistic', 'relu', 'softmax', 'tanh'].

  

  warnings.warn("Estimator fit failed. The score on this train-test"
/Users/opt/anaconda3/lib/python3.9/site-packages/sklearn/model_selection/_search.py:922: UserWarning: One or more of the test scores are non-finite: [-1.77668202e+01 -5.36865257e+00 -2.36108673e+01 -4.87857020e+00
 -5.37871220e+01 -7.33847388e+00 -1.18104844e+01 -4.31770716e+00
 -1.07982957e+01 -5.90410809e+00 -6.90785910e+01 -3.52662014e+00
 -1.28640696e+01 -2.08943515e+00 -8.90821473e+00 -3.54330679e+00
 -6.41294642e+01 -3.43469060e+00 -2.03283528e+01 -7.13374531e+00
 -1.94873065e+01 -6.80254241e+00 -8.26138602e+01 -3.60145086e+00
 -1.44247785e+01  1.82313539e-01 -1.27466817e+01  1.42093292e-02
 -8.58307682e+01  2.04012758e-01 -4.14186486e+00  6.44037466e-01
 -9.19520894e+00  5.57317462e-01 -7.21849359e+01  7.55050142e-01
 -2.59194519e+01 -3.13963793e+00 -3.21942292e+01 -3.62738244e+00
 -4.22512000e+01 -2.97608298e+00 -1.82051179e+01 -3.43752161e+00
 -1.99742744e+01 -2.04712456e+00 -6.01852277e+01 -4.21323595e+00
 -2.38904146e+01 -1.14621391e+00 -1.68763533e+01 -4.01997962e-02
 -5.93074730e+01 -3.83233190e-01 -2.86875402e+01 -2.29876280e+00
 -3.05378372e+01 -1.79326275e+00 -4.12499389e+01 -3.32545463e+00
 -2.34686687e+01  1.92512129e-01 -1.40481244e+01 -2.00564895e-01
 -2.84224859e+01  3.25566168e-01 -1.77123178e+01  7.28578976e-01
 -2.35861764e+01  5.12092072e-01 -3.44553330e+01  6.87623634e-01
             nan             nan             nan             nan
             nan             nan             nan             nan
             nan             nan             nan             nan
             nan             nan             nan             nan
             nan             nan             nan             nan
             nan             nan             nan             nan
             nan             nan             nan             nan
             nan             nan             nan             nan
             nan             nan             nan             nan]


{'activation': 'tanh', 'alpha': 0.05, 'early_stopping': False, 'hidden_layer_sizes': (50, 75, 100), 'learning_rate': 'invscaling', 'max_iter': 10000000000, 'solver': 'adam', 'warm_start': False}

Answer 1

Your approach is OK; however, it is hard to know the right number of layers and neurons beforehand. It is really problem-dependent.

Grid search, as you are using it, is a reasonable option, especially for finding the order of magnitude of the parameters (10, 100, 1000). People then often use RandomizedSearchCV to refine the search around the best values found in the previous steps.
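As a rough illustration of that refinement step, here is a minimal sketch using RandomizedSearchCV. The data is a synthetic stand-in with 9 inputs and 19 outputs, and the candidate layer sizes and the sampling ranges are assumptions chosen for the example, not values recommended for your problem.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_regression
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in for the real data: 9 inputs, 19 outputs (assumption).
X, y = make_regression(n_samples=200, n_features=9, n_targets=19,
                       noise=0.1, random_state=42)
# Standardise the targets, mirroring the scaled data frame in the question.
y = (y - y.mean(axis=0)) / y.std(axis=0)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# Hypothetical search space: discrete lists are sampled uniformly,
# continuous parameters are drawn from log-uniform distributions.
param_distributions = {
    "hidden_layer_sizes": [(50,), (100,), (50, 50), (100, 50)],
    "activation": ["tanh", "relu"],
    "alpha": loguniform(1e-4, 1e-1),
    "learning_rate_init": loguniform(1e-4, 1e-2),
}

search = RandomizedSearchCV(
    MLPRegressor(solver="adam", max_iter=500, early_stopping=True,
                 random_state=0),
    param_distributions=param_distributions,
    n_iter=5,        # number of random candidates to try
    cv=3,
    scoring="r2",
    random_state=0,
)
search.fit(X_train, y_train)

print(search.best_params_)
print(search.best_score_)
```

Unlike a grid, the cost here is fixed by n_iter rather than by the product of all list lengths, so continuous ranges such as alpha can be explored without enumerating every value.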

Note that there are also more advanced tools, such as KerasTuner, which lets you run a hyperparameter search with more sophisticated strategies, such as Bayesian optimization over the hyperparameter space, but I am not sure whether those are compatible with sklearn.

That said, your warning/error messages appear because some of the values you are trying to validate are either not available in the sklearn version you are using or simply do not exist. Also keep in mind that every combination of hyperparameters you validate must be coherent (for instance, if the docs say that when param a=0, b must be non-negative, you must ensure that each point of your param grid meets that condition).
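One way to keep the combinations coherent is to pass a list of sub-grids instead of a single dictionary. The sketch below assumes you want to try 'lbfgs' as a solver (which is where sklearn accepts it) and relies on the documented behaviour that learning_rate is only used when solver='sgd'. The variable name shared is just for illustration.

```python
from sklearn.model_selection import ParameterGrid

# Settings tried with every solver.
shared = {
    "hidden_layer_sizes": [(100,), (50, 100), (50, 75, 100)],
    "activation": ["tanh", "relu"],
    "alpha": [0.0001, 0.05],
}

param_grid = [
    # Learning-rate schedules only affect the 'sgd' solver.
    {**shared, "solver": ["sgd"],
     "learning_rate": ["constant", "adaptive", "invscaling"]},
    # 'adam' and 'lbfgs' ignore learning_rate, so it is left out here.
    {**shared, "solver": ["adam", "lbfgs"]},
]

# ParameterGrid expands the sub-grids the same way GridSearchCV does.
n_candidates = len(ParameterGrid(param_grid))
print(n_candidates)  # 36 + 24 = 60
```

The same list can be passed as param_grid to GridSearchCV, which avoids fitting candidates that differ only in a parameter the chosen solver never reads.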

So, in summary, double-check the documentation to make sure your hyperparameter grid is OK, and read your stack trace carefully, because some of the errors are described there (like the unsupported activation 'lbfgs'; in sklearn, 'lbfgs' is a value for the solver parameter, not an activation).
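To see how the stack trace lines up with the scores in the warning, the grid from the question can be checked by hand. This is a small plain-Python sketch; the set of supported activations is copied from the ValueError message above, and the single-valued parameters (max_iter, early_stopping, warm_start) are left out because they do not change the count.

```python
from itertools import product

# Copied from the ValueError message in the traceback.
SUPPORTED_ACTIVATIONS = {"identity", "logistic", "relu", "softmax", "tanh"}

# The grid from the question, without the single-valued entries.
param_grid = {
    "hidden_layer_sizes": [(100,), (50, 100), (50, 75, 100)],
    "activation": ["tanh", "relu", "lbfgs"],
    "solver": ["sgd", "adam"],
    "learning_rate": ["constant", "adaptive", "invscaling"],
    "alpha": [0.0001, 0.05],
}

# Enumerate every combination, as a grid search would.
keys = list(param_grid)
candidates = [dict(zip(keys, values))
              for values in product(*param_grid.values())]
invalid = [c for c in candidates
           if c["activation"] not in SUPPORTED_ACTIVATIONS]

print(len(candidates))   # 108 candidates in total
print(len(invalid))      # 36 of them use activation='lbfgs'
print(len(invalid) * 5)  # 180 failed fits with cv=5
```

The 36 invalid candidates match the 36 nan entries in the score array printed with the UserWarning, so removing 'lbfgs' from the activation list should make both warnings go away.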
