Keras: problem with gradients, custom layer won't work in a Sequential model

This is the error message I get with the code below:
ValueError: No gradients provided for any variable: ['Variable:0'].
It is raised in model.fit(), right after execution has gone through the whole layer's build().

It prints the input and the scalar after going through build() and before raising the error, but both tensors appear to be empty:

Tensor("IteratorGetNext:0", shape=(None, 1), dtype=float32)  
<tf.Variable 'Variable:0' shape=(1,) dtype=float>

My goal was to write a (basic) custom layer and to insert it in a (basic) model. My custom layer works properly on its own, but I can't get the model to fit. The layer takes a tensor and multiplies it by a scalar. I want my model to give me input * (a scalar I chose early on).

Thus far I've gotten plenty of errors and warnings about the dtype of various tensors (I had int32 instead of float32), so I wrote plenty of casts. I also had a more complex model, but I stripped it to the bones to debug (it didn't help much…).

I tried with and without a build(), with and without using to_categorical on the labels, with vector inputs and scalar inputs, and other probably insignificant changes.

Here is the code of the layer:

from tensorflow.python.keras import layers
import tensorflow as tf
from tensorflow.python.ops import math_ops
from tensorflow.python.framework import tensor_shape
import h5py
import numpy as np


class MyBasicLayer(layers.Layer):
    def __init__(self, **kwargs):
        super().__init__(self)
        self._set_dtype_policy('float32')
        self.w = self.add_weight(shape=(1,), initializer='zeros', trainable=True)

    def build(self, input_shape):
        input_shape = tensor_shape.TensorShape(input_shape)
        if tensor_shape.dimension_value(input_shape[-1]) is None:
            raise ValueError('The last dimension of the inputs to `MyBasicLayer` should be defined. Found `None`.')
        super().build(input_shape)

    def call(self, inputs):
        print(inputs)
        print(self.w)
        return tf.math.multiply(tf.dtypes.cast(inputs,dtype='float32'),self.w)

And here is the code of the model:

import numpy as np
import tensorflow as tf
import os
from tensorflow.keras import Sequential
from my_basic_layer import MyBasicLayer
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical
from tensorflow.python.keras.layers import Activation
from tensorflow.keras import activations



k = 2.

# load the dataset
inset = np.array([[i] for i in range(40)], dtype='float32')
outset = inset * k
#outset = to_categorical(outset, num_classes =256)

# define the model
model = Sequential()
model.add(MyBasicLayer(input_shape=(1,))) #input_shape=(4,)
#model.add(Activation(activations.softmax))

# compile the model
model.compile()

# fit the model
model.fit(inset, outset)
model.summary()

Maybe relevant, for all I know:
I wanted to call model.summary() before the compilation, but I got
This model has not yet been built. Build the model first by calling build() or calling fit() with some data, or specify an input_shape argument in the first layer(s) for automatic build.
even after adding the famous input_shape argument in the first layer.

Thank you

Specifying the solution here (answer section), even though it is already present in the comments section, for the benefit of the community.

The error, ValueError: No gradients provided for any variable: ['Variable:0']., occurs in the above case because no loss function was provided when the model was compiled.

So, replacing

model.compile()

with

model.compile(loss='categorical_crossentropy')

will fix the error.
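To illustrate why a missing loss leads to this message, here is a minimal sketch using tf.GradientTape directly. It is not what Keras runs internally, only an analogy: the names w, x, y_true and no_loss are hypothetical stand-ins, and the constant no_loss plays the role of a training objective that does not depend on the layer weight.

```python
import tensorflow as tf

# Hypothetical stand-ins for the layer weight and a tiny batch of data.
w = tf.Variable([0.0])
x = tf.constant([[1.0], [2.0]])
y_true = 2.0 * x

with tf.GradientTape(persistent=True) as tape:
    y_pred = x * w
    # A real loss depends on w through y_pred.
    mse = tf.reduce_mean(tf.square(y_pred - y_true))
    # A constant objective has no connection to w at all.
    no_loss = tf.constant(0.0)

grad_with_loss = tape.gradient(mse, w)
grad_without_loss = tape.gradient(no_loss, w)

print(grad_with_loss)     # a tensor of shape (1,)
print(grad_without_loss)  # None: nothing connects the target to w
```

An optimizer that only receives None gradients has nothing to apply, which is consistent with the "No gradients provided for any variable" error.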

For the sake of completeness, a simple working example that uses the custom layer is shown below:

import numpy as np
import tensorflow as tf
from tensorflow.keras import Sequential, layers


class MyBasicLayer(layers.Layer):
    """Multiplies its input by a single trainable scalar."""

    def __init__(self, **kwargs):
        # Note: this call does not forward **kwargs, so input_shape never
        # reaches the base Layer; super().__init__(**kwargs) would forward it.
        super().__init__(self)
        self._set_dtype_policy('float32')
        # One trainable scalar, initialised to zero.
        self.w = self.add_weight(shape=(1,), initializer='zeros', trainable=True)

    def build(self, input_shape):
        # Reject inputs whose last dimension is unknown.
        input_shape = tf.TensorShape(input_shape)
        if tf.compat.dimension_value(input_shape[-1]) is None:
            raise ValueError('The last dimension of the inputs to `MyBasicLayer` should be defined. Found `None`.')
        super().build(input_shape)

    def call(self, inputs):
        # Debug prints: these show symbolic tensors while the graph is traced.
        print(inputs)
        print(self.w)
        return tf.math.multiply(tf.dtypes.cast(inputs, dtype='float32'), self.w)



k = 2.

# load the dataset
inset = np.array([[i] for i in range(40)], dtype='float32')
outset = inset * k
#outset = to_categorical(outset, num_classes =256)

# define the model
model = Sequential()
model.add(MyBasicLayer(input_shape=(1,))) #input_shape=(4,)

# compile the model
model.compile(loss='categorical_crossentropy')

# fit the model
model.fit(inset, outset)
model.summary()

The output of the above code is shown below:

Tensor("IteratorGetNext:0", shape=(None, 1), dtype=float32)
<tf.Variable 'Variable:0' shape=(1,) dtype=float32>
Tensor("IteratorGetNext:0", shape=(None, 1), dtype=float32)
<tf.Variable 'Variable:0' shape=(1,) dtype=float32>
2/2 [==============================] - 0s 2ms/step - loss: nan
Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
my_basic_layer_3 (MyBasicLay multiple                  1         
=================================================================
Total params: 1
Trainable params: 1
Non-trainable params: 0
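The loss: nan in the output above suggests that categorical cross-entropy, while it removes the error, is not a good fit for this target, which is a plain regression (output = input * k). As a hedged alternative, here is a minimal sketch that uses mean squared error instead. The class name ScaleLayer, the rescaling of the inputs to [0, 1), and the SGD learning rate are assumptions made for this illustration and are not part of the original answer. The sketch also passes an explicit tf.keras.Input rather than an input_shape keyword, so that the model is built before fit().

```python
import numpy as np
import tensorflow as tf


class ScaleLayer(tf.keras.layers.Layer):
    """Hypothetical variant of the custom layer: output = input * w."""

    def build(self, input_shape):
        # Create the single trainable scalar once the input shape is known.
        self.w = self.add_weight(shape=(1,), initializer="zeros", trainable=True)

    def call(self, inputs):
        return tf.cast(inputs, "float32") * self.w


k = 2.0
# Inputs rescaled to [0, 1) so that plain SGD stays stable (an assumption).
inset = np.arange(40, dtype="float32").reshape(-1, 1) / 40.0
outset = inset * k

scale = ScaleLayer()
model = tf.keras.Sequential([tf.keras.Input(shape=(1,)), scale])
model.summary()  # works before fit() because the input shape is known

# A regression loss gives the optimizer a gradient for w.
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.5), loss="mse")
history = model.fit(inset, outset, epochs=50, batch_size=40, verbose=0)

learned = float(scale.w.numpy()[0])
print("learned scalar:", learned)
```

With these settings the learned scalar should approach k, since the only weight in the model is the multiplier itself.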

Hope this helps. Happy learning!
