
InvalidArgumentError on custom TensorFlow model

As part of the TPS July challenge, I am trying to implement a custom TensorFlow model based on a recurrent neural network.

The idea: I want to include an RNN that predicts the values for the current iteration based on the model's prediction from the previous iteration. So I implemented a custom model that stores the output of the current iteration and feeds it into the model's LSTM layer on the next iteration.

However, when I call the model's fit method, I get the following error:

InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-29-0457da000b62> in <module>
----> 1 model.fit(train_x,train_labels,epochs=100)

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
   1098                 _r=1):
   1099               callbacks.on_train_batch_begin(step)
-> 1100               tmp_logs = self.train_function(iterator)
   1101               if data_handler.should_sync:
   1102                 context.async_wait()

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py in __call__(self, *args, **kwds)
    826     tracing_count = self.experimental_get_tracing_count()
    827     with trace.Trace(self._name) as tm:
--> 828       result = self._call(*args, **kwds)
    829       compiler = "xla" if self._experimental_compile else "nonXla"
    830       new_tracing_count = self.experimental_get_tracing_count()

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py in _call(self, *args, **kwds)
    886         # Lifting succeeded, so variables are initialized and we can run the
    887         # stateless function.
--> 888         return self._stateless_fn(*args, **kwds)
    889     else:
    890       _, _, _, filtered_flat_args = \

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/function.py in __call__(self, *args, **kwargs)
   2941        filtered_flat_args) = self._maybe_define_function(args, kwargs)
   2942     return graph_function._call_flat(
-> 2943         filtered_flat_args, captured_inputs=graph_function.captured_inputs)  # pylint: disable=protected-access
   2944 
   2945   @property

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/function.py in _call_flat(self, args, captured_inputs, cancellation_manager)
   1917       # No tape is watching; skip to running the function.
   1918       return self._build_call_outputs(self._inference_function.call(
-> 1919           ctx, args, cancellation_manager=cancellation_manager))
   1920     forward_backward = self._select_forward_and_backward_functions(
   1921         args,

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/function.py in call(self, ctx, args, cancellation_manager)
    558               inputs=args,
    559               attrs=attrs,
--> 560               ctx=ctx)
    561         else:
    562           outputs = execute.execute_with_cancellation(

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     58     ctx.ensure_initialized()
     59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60                                         inputs, attrs, num_outputs)
     61   except core._NotOkStatusException as e:
     62     if name is not None:

InvalidArgumentError:  Can not squeeze dim[0], expected a dimension of 1, got 32
     [[{{node lstm_model/weight_normalization_15/cond/else/_1/lstm_model/weight_normalization_15/cond/data_dep_init/moments/Squeeze}}]] [Op:__inference_train_function_11775]

Function call stack:
train_function

Am I using the model the right way? If not, what would be a better implementation of my idea?

My code:

import numpy as np  # needed for np.asarray below
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import tensorflow_addons as tfa

# Split the TPS July training data into the three targets and the feature columns
train_labels = train_data[['target_carbon_monoxide','target_benzene','target_nitrogen_oxides']].copy()
train_x = train_data.drop(['target_carbon_monoxide','target_benzene','target_nitrogen_oxides','date_time'], axis=1)
train_x.head()
train_labels = np.asarray(train_labels).reshape(-1, 1, 3)


# Global variable holding the previous iteration's prediction, fed back into the LSTM branch
curroutput = tf.Variable(initial_value=[[0.0, 0.0, 0.0]], shape=(1, 3), dtype=tf.float32)
class CompleteModel(keras.Model):
    def train_step(self, data):
        x, y = data
        # x = tf.reshape(self.curroutput, shape=(1, 1, 3))

        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)
            # compiled_loss expects (y_true, y_pred)
            loss = self.compiled_loss(y, y_pred, regularization_losses=self.losses)

        # Save the current prediction so RNNInputLayer can feed it back on the next iteration
        global curroutput
        curroutput.assign(y_pred)
        trainable_vars = self.trainable_variables
        gradients = tape.gradient(loss, trainable_vars)
        self.optimizer.apply_gradients(zip(gradients, trainable_vars))
        self.compiled_metrics.update_state(y, y_pred)
        return {m.name: m.result() for m in self.metrics}



class RNNInputLayer(keras.layers.Layer):
    """Ignores its input and returns the previous iteration's prediction, reshaped for the LSTM."""
    def __init__(self):
        super(RNNInputLayer, self).__init__()

    def call(self, inputs):
        global curroutput
        return tf.reshape(curroutput, shape=(1, 1, 3))



def make_model():
    input_layer = layers.Input(shape=(8,), batch_size=1)
    # Dense branch over the raw features
    dense_in = tfa.layers.WeightNormalization(layers.Dense(16, activation='selu'))(input_layer)
    dense_in2 = tfa.layers.WeightNormalization(layers.Dense(32, activation='selu'))(dense_in)
    dense_out = tfa.layers.WeightNormalization(layers.Dense(8, activation='selu'))(dense_in)
    # LSTM branch fed with the previous iteration's prediction
    rnn_input = RNNInputLayer()(input_layer)
    lstm_layer = layers.LSTM(units=16, input_shape=(1, 3))(rnn_input)
    lstm_dense = tfa.layers.WeightNormalization(layers.Dense(16, activation='selu'))(lstm_layer)
    # Merge the two branches and project down to the three targets
    finalconcat = layers.Concatenate()([dense_out, lstm_dense])
    final_dense = tfa.layers.WeightNormalization(layers.Dense(16, activation='selu'))(finalconcat)
    output_layer = layers.Dense(3)(final_dense)

    model = CompleteModel(inputs=input_layer, outputs=output_layer, name='lstm_model')
    return model

model = make_model()
model.compile(loss=tf.keras.losses.MeanSquaredLogarithmicError(), optimizer='Adam')



model.fit(train_x, train_labels, epochs=100)  # raises the InvalidArgumentError shown above

As Priya pointed out, the dim[0] in the error message refers to the batch dimension, so this answer may not resolve your error, but it should certainly help you implement the model.
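If the batch dimension is indeed the culprit: the model is built with batch_size=1 and the global curroutput holds a single (1, 3) prediction, while fit defaults to batches of 32, which matches the "expected a dimension of 1, got 32" in the traceback. A minimal sketch of one thing worth trying, not verified against this data:

model.fit(train_x, train_labels, epochs=100, batch_size=1)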

In your make_model() function, try changing dense_out to

dense_out = tfa.layers.WeightNormalization(layers.Dense(8, activation='selu'))(dense_in2)

In the make_model() function, the dense_in2 layer is never used as an input to the dense_out layer. You probably meant to use this layer in the model and forgot to add the 2 at the end of the variable name. Also, I assume the dense_out layer is looking for an input dimension of 32 neurons, which may be the cause of the error (although this may be bypassed entirely by the way the WeightNormalization layer objects are implemented).
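For reference, a minimal sketch of make_model() with dense_in2 wired in as suggested (everything else unchanged from the question):

def make_model():
    input_layer = layers.Input(shape=(8,), batch_size=1)
    dense_in = tfa.layers.WeightNormalization(layers.Dense(16, activation='selu'))(input_layer)
    dense_in2 = tfa.layers.WeightNormalization(layers.Dense(32, activation='selu'))(dense_in)
    # dense_in2 is now actually consumed by dense_out
    dense_out = tfa.layers.WeightNormalization(layers.Dense(8, activation='selu'))(dense_in2)
    rnn_input = RNNInputLayer()(input_layer)
    lstm_layer = layers.LSTM(units=16, input_shape=(1, 3))(rnn_input)
    lstm_dense = tfa.layers.WeightNormalization(layers.Dense(16, activation='selu'))(lstm_layer)
    finalconcat = layers.Concatenate()([dense_out, lstm_dense])
    final_dense = tfa.layers.WeightNormalization(layers.Dense(16, activation='selu'))(finalconcat)
    output_layer = layers.Dense(3)(final_dense)
    return CompleteModel(inputs=input_layer, outputs=output_layer, name='lstm_model')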
