
Auto-encoders with tied weights in Caffe

From my understanding, an auto-encoder normally uses tied weights in the encoding and decoding networks, right?

I took a look at Caffe's auto-encoder example, but I didn't see how the weights are tied. I noticed that the encoding and decoding networks share the same blobs, but how is it guaranteed that the weights are updated correctly?

How do I implement a tied-weight auto-encoder in Caffe?

While there's a history of using tied weights in auto-encoders, nowadays it is rarely used (to the best of my knowledge), which I believe is why this Caffe example doesn't use tied weights.

Nevertheless, Caffe does support auto-encoders with tied weights, and it is possible using two features: parameter sharing between layers, and the transpose flag of the fully-connected layer (InnerProduct in Caffe). More specifically, two parameters are shared in Caffe if they have the same name, which can be specified under the param field like so:

layer {
  name: "encode1"
  type: "InnerProduct"
  bottom: "data"
  top: "encode1"
  param {
    name: "encode1_matrix"
    lr_mult: 1
    decay_mult: 1
  }
  param {
    name: "encode1_bias"
    lr_mult: 1
    decay_mult: 0
  }
  inner_product_param {
    num_output: 128
    weight_filler {
      type: "gaussian"
      std: 1
      sparse: 15
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}

If another fully-connected layer (with matching dimensions) uses the names "encode1_matrix" and "encode1_bias", then these parameters will always be the same, and Caffe will take care of aggregating the gradients and updating the parameters correctly. The second part is the transpose flag of the fully-connected layer, which transposes the shared matrix before it is multiplied with the layer's input. So, extending the above example, if we want a fully-connected layer in the decoding path that uses the same weight matrix as "encode1_matrix", we define it like so:

layer {
  name: "decode1"
  type: "InnerProduct"
  bottom: "encode1"
  top: "decode1"
  param {
    name: "encode1_matrix"
    lr_mult: 1
    decay_mult: 1
  }
  param {
    name: "decode1_bias"
    lr_mult: 1
    decay_mult: 0
  }
  inner_product_param {
    num_output: 784
    transpose: true
    weight_filler {
      type: "gaussian"
      std: 1
      sparse: 15
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}

Notice that the bias parameters are not shared (they cannot be, due to the different output dimensions), while the weight matrix is shared, and the decoder layer uses the transpose flag. Together, this completes the tied-weight auto-encoder architecture.
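To make the mechanics concrete, here is a minimal numpy sketch (hypothetical, not Caffe code) of what the two prototxt layers above compute: the decoder reuses the encoder matrix transposed, and the gradient for the shared matrix is the sum of its contributions from both uses, which is the aggregation Caffe performs for same-named parameters. All variable names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hidden = 784, 128                       # matches the prototxt dims
W = rng.normal(0, 0.01, size=(n_hidden, n_in))  # shared "encode1_matrix"
b_enc = np.zeros(n_hidden)                      # encoder bias (not shared)
b_dec = np.zeros(n_in)                          # decoder bias (not shared)

def forward(x):
    # encode1: standard inner product (ReLU added just for illustration)
    h = np.maximum(0.0, W @ x + b_enc)
    # decode1: the SAME matrix, transposed -- what `transpose: true` does
    x_hat = W.T @ h + b_dec
    return h, x_hat

x = rng.normal(size=n_in)
h, x_hat = forward(x)

# Backprop of 0.5*||x_hat - x||^2: the shared W receives one gradient term
# through the decoder (where it appears as W.T) and one through the encoder;
# the effective update uses their sum, as Caffe does for shared parameters.
err = x_hat - x                     # dL/dx_hat
grad_W_dec = np.outer(h, err)       # decoder contribution, shape (hidden, in)
dh = W @ err                        # backprop into the hidden activations
dh[h <= 0] = 0.0                    # ReLU gate
grad_W_enc = np.outer(dh, x)        # encoder contribution, shape (hidden, in)
grad_W = grad_W_dec + grad_W_enc    # aggregated gradient for the shared W
```

The key point the sketch illustrates is that sharing the matrix is safe precisely because the framework sums the gradients from every layer that uses it before applying the update.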

See here for a complete working example of a tied auto-encoder using Caffe: https://gist.github.com/orsharir/beb479d9ad5d8e389800c47c9ec42840
