Google Colab TPU: TF.data and TF.keras not working

I'm using Talos and a Google Colab TPU to run hyperparameter tuning of a Keras model, with TensorFlow 2.0.0 and Keras 2.2.4-tf:

import os
import tensorflow as tf
import talos as ta
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

def iris_model(x_train, y_train, x_val, y_val, params):
    # Specify a distributed strategy to use TPU
    resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])
    tf.config.experimental_connect_to_host(resolver.master())
    tf.tpu.experimental.initialize_tpu_system(resolver)
    strategy = tf.distribute.experimental.TPUStrategy(resolver)

    # Use the strategy to create and compile a Keras model
    with strategy.scope():
      model = Sequential()
      model.add(Dense(32, input_dim=4, activation=params['activation']))
      model.add(Dense(3, activation='softmax'))
      model.compile(optimizer=params['optimizer'], loss=params['losses'])

    # Convert the train set to a Dataset to use TPU
    dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
    dataset = dataset.cache().shuffle(1000, reshuffle_each_iteration=True).repeat().batch(params['batch_size'], drop_remainder=True)

    # Fit the Keras model on the dataset
    out = model.fit(dataset, batch_size=params['batch_size'], epochs=params['epochs'], validation_data=[x_val, y_val], verbose=0)

    return out, model

# Load the iris dataset bundled with Talos
x, y = ta.templates.datasets.iris()

# Define the hyperparameter distributions
p = {'activation': ['relu', 'elu'],
       'optimizer': ['Nadam', 'Adam'],
       'losses': ['logcosh'],
       'batch_size': (20, 50, 5),
       'epochs': [10, 20]}

# Use Talos to scan the best hyperparameters of the Keras model
scan_object = ta.Scan(x, y, model=iris_model, params=p, fraction_limit=0.1, experiment_name='first_test')

After converting the training set to a tf.data.Dataset, I get the following error when fitting the Keras model with model.fit:

/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training_distributed.py in fit(self, model, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_freq, **kwargs)
    609         validation_split=validation_split)
    610     batch_size = model._validate_or_infer_batch_size(
--> 611         batch_size, steps_per_epoch, x)
    612     dataset = model._distribution_standardize_user_data(
    613         x, y,

/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training.py in _validate_or_infer_batch_size(self, batch_size, steps, x)
   1815             'The `batch_size` argument must not be specified for the given '
   1816             'input type. Received input: {}, batch_size: {}'.format(
-> 1817                 x, batch_size))
   1818       return
   1819 

ValueError: The `batch_size` argument must not be specified for the given input type. Received input: <BatchDataset shapes: ((38, 4), ((38, 3)), types: (tf.float64, tf.float32)>, batch_size: 38

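The dataset is already batched by .batch(params['batch_size'], drop_remainder=True), so model.fit must not be given batch_size as well; that is what the ValueError reports. Replace:

out = model.fit(dataset, batch_size=params['batch_size'], epochs=params['epochs'], validation_data=[x_val, y_val], verbose=0)

with:

out = model.fit(dataset, epochs=params['epochs'], validation_data=[x_val, y_val], verbose=0)

I think this will solve your problem.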
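One further point, offered as an assumption rather than something the question confirms: the dataset is built with .repeat() and no count, so it never ends, and Keras may then need steps_per_epoch to know where an epoch stops. A minimal sketch of how that value could be derived follows; steps_for_epoch is a hypothetical helper name, not part of Keras or Talos.

```python
def steps_for_epoch(num_samples, batch_size, drop_remainder=True):
    """Return how many batches make up one pass over num_samples items.

    Hypothetical helper for illustration only. With drop_remainder=True
    (as in the question's .batch(...) call) a trailing partial batch is
    discarded, so plain floor division applies.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    full, leftover = divmod(num_samples, batch_size)
    if drop_remainder or leftover == 0:
        return full
    # A partial final batch counts as one extra step.
    return full + 1


# Example with made-up sizes: 105 training rows, batch size 38.
print(steps_for_epoch(105, 38))                        # 2
print(steps_for_epoch(105, 38, drop_remainder=False))  # 3
```

The result could then be passed as steps_per_epoch=steps_for_epoch(len(x_train), params['batch_size']) in the model.fit call above.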

Please try with TensorFlow 2.1 or 2.2.

See https://colab.research.google.com/notebooks/tpu.ipynb and make sure the accelerator is set to TPU: Runtime -> Change runtime type -> TPU.
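A quick way to confirm that the runtime really has a TPU attached is to check the COLAB_TPU_ADDR environment variable that the question's code reads. The sketch below assumes that variable is how the runtime exposes the TPU, which may not hold for every Colab runtime version; colab_tpu_address is a hypothetical helper name.

```python
import os


def colab_tpu_address(env=None):
    """Return the gRPC address of the Colab TPU, or None if none is set.

    Hypothetical helper: it only inspects COLAB_TPU_ADDR, the variable
    used in the question, and does not contact the TPU itself.
    """
    env = os.environ if env is None else env
    addr = env.get("COLAB_TPU_ADDR")
    return "grpc://" + addr if addr else None


address = colab_tpu_address()
if address is None:
    print("No TPU address found; check Runtime -> Change runtime type.")
else:
    print("TPU available at", address)
```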
