
How can I tune neural network architecture using KerasTuner?

I'm trying to use KerasTuner to automatically tune the neural network architecture, i.e., the number of hidden layers and the number of nodes in each hidden layer. Currently, the architecture is defined by a single parameter, NN_LAYER_SIZES. For example,

NN_LAYER_SIZES = [128, 128, 128, 128]

indicates that the NN has 4 hidden layers and each hidden layer has 128 nodes.

KerasTuner has the following hyperparameter types (https://keras.io/api/keras_tuner/hyperparameters/):

  • Int
  • Float
  • Boolean
  • Choice

It seems none of these hyperparameter types fits my use case, so I wrote the following code to scan the number of hidden layers and the number of nodes. However, it is not recognized as a hyperparameter.

number_of_hidden_layer = hp.Int("layer_number", min_value=2, max_value=5, step=1)
number_of_nodes = hp.Int("node_number", min_value=4, max_value=8, step=1)
NN_LAYER_SIZES = [2**number_of_nodes for _ in range(number_of_hidden_layer)]

Any suggestions on how to make it right?

You could treat the number of layers as a hyperparameter and iterate over it when building your model. That way you can experiment with different numbers of layers combined with different numbers of nodes:

import tensorflow as tf
import keras_tuner as kt

def model_builder(hp):
  model = tf.keras.Sequential()
  model.add(tf.keras.layers.Flatten(input_shape=(28, 28)))

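  # Both values are drawn through hp, so the tuner tracks them as hyperparameters.
  units = hp.Int('units', min_value=32, max_value=512, step=32)
  layers = hp.Int('layers', min_value=2, max_value=5, step=1)

  # Stack `layers` hidden layers, each with `units` nodes.
  for _ in range(layers):
    model.add(tf.keras.layers.Dense(units=units, activation='relu'))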

  model.add(tf.keras.layers.Dense(10))

  model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                metrics=['accuracy'])
  return model

(img_train, label_train), (_, _) = tf.keras.datasets.fashion_mnist.load_data()
img_train = img_train.astype('float32') / 255.0

tuner = kt.Hyperband(model_builder,
                     objective='val_accuracy',
                     max_epochs=10,
                     factor=3)

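# Run the search, then retrieve the best hyperparameter set found.
tuner.search(img_train, label_train, epochs=50, validation_split=0.2)
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]

# Rebuild the model with the best hyperparameters and train it.
model = tuner.hypermodel.build(best_hps)
history = model.fit(img_train, label_train, epochs=50, validation_split=0.2)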
