
Embed trainable bijector into Keras model

I am trying to implement normalizing flows embedded in a Keras model. In all the examples I can find, such as the documentation of MAF, the bijectors that constitute the normalizing flows are embedded in a TransformedDistribution and exposed directly for training etc.

I am trying to embed this TransformedDistribution in a Keras Model to match the architecture of my other models, which inherit from Keras Model.

Unfortunately, all my attempts so far (see code) fail to transfer the trainable variables inside the transformed distribution to the Keras Model.

I have tried making the bijector inherit from tf.keras.layers.Layer, which did not change anything.

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions
tfb = tfp.bijectors


class Flow(tfb.Bijector, tf.Module):
    """
    tf.Module to register trainable_variables
    """

    def __init__(self, d, init_sigma=0.1, **kwargs):
        super(Flow, self).__init__(
            dtype=tf.float32,
            forward_min_event_ndims=0,
            inverse_min_event_ndims=0,
            **kwargs
        )
        # Shape of the flow goes from Rd to Rd
        self.d = d
        # Weights/Variables initializer
        self.init_sigma = init_sigma
        w_init = tf.random_normal_initializer(stddev=self.init_sigma)
        # Variables
        self.u = tf.Variable(
            w_init(shape=[1, self.d], dtype=tf.float32),
            dtype=tf.float32,
            name='u',
            trainable=True,
        )

    def _forward(self, x):
        return x

    def _inverse(self, y):
        return y


class Flows(tf.keras.Model):

    def __init__(self, d=2, shape=(100, 2), n_flows=10, ):
        super(Flows, self).__init__()
        # Parameters
        self.d = d
        self.shape = shape
        self.n_flows = n_flows
        # Base distribution - MF = Multivariate normal diag
        base_distribution = tfd.MultivariateNormalDiag(
            loc=tf.zeros(shape=shape, dtype=tf.float32)
        )
        # Flows as chain of bijector
        flows = []
        for n in range(n_flows):
            flows.append(Flow(self.d, name=f"flow_{n + 1}"))
        bijector = tfb.Chain(list(reversed(flows)))
        self.flow = tfd.TransformedDistribution(
            distribution=base_distribution,
            bijector=bijector
        )

    def call(self, *inputs):
        return self.flow.bijector.forward(*inputs)

    def log_prob(self, *inputs):
        return self.flow.log_prob(*inputs)

    def sample(self, num):
        return self.flow.sample(num)


q = Flows()
# Call to instantiate variables
q(tf.zeros(q.shape))
# Prints no trainable params
q.summary()  # summary() prints the table itself and returns None
# Prints expected trainable params
print(q.flow.trainable_variables)

Any idea if this is even possible? Thanks!

I bumped into this issue as well. It seems to be caused by incompatibilities between TFP and TF 2.0 (a couple of relevant issues: https://github.com/tensorflow/probability/issues/355 and https://github.com/tensorflow/probability/issues/946).

As a workaround, you need to add the (trainable) variables of your transformed distribution / bijector as an attribute of your Keras Model:

class Flows(tf.keras.Model):

    def __init__(self, d=2, shape=(100, 2), n_flows=10, ):
        super(Flows, self).__init__()
        # Parameters
        self.d = d
        self.shape = shape
        self.n_flows = n_flows
        # Base distribution - MF = Multivariate normal diag
        base_distribution = tfd.MultivariateNormalDiag(
            loc=tf.zeros(shape=shape, dtype=tf.float32)
        )
        # Flows as chain of bijector
        flows = []
        for n in range(n_flows):
            flows.append(Flow(self.d, name=f"flow_{n + 1}"))
        bijector = tfb.Chain(list(reversed(flows)))
        self.flow = tfd.TransformedDistribution(
            distribution=base_distribution,
            bijector=bijector
        )
        # issue: https://github.com/tensorflow/probability/issues/355, https://github.com/tensorflow/probability/issues/946
        # need to add bijector's trainable variables as an attribute (name does not matter)
        # otherwise this layer has zero trainable variables
        self._variables = self.flow.variables # https://github.com/tensorflow/probability/issues/355

    def call(self, *inputs):
        return self.flow.bijector.forward(*inputs)

    def log_prob(self, *inputs):
        return self.flow.log_prob(*inputs)

    def sample(self, num):
        return self.flow.sample(num)

After adding this, your model should have trainable variables and weights to optimize.
