Building custom Caffe layer in python

After going through many links about building Caffe layers in Python, I still have difficulty understanding a few concepts. Can someone please clarify them?

What I am still missing is:

  1. setup() method: what should I do here? Why, in the example, should I compare the length of the 'bottom' param with '2'? Why should it be 2? It does not seem to be the batch size, because that is arbitrary; and as I understand it, bottom is a blob whose first dimension is the batch size?
  2. reshape() method: as I understand it, the 'bottom' input param is the blob of the layer below, and the 'top' param is the blob of the layer above, and I need to reshape the top layer according to the output shape of my forward-pass computation. But why do I need to do this on every forward pass if these shapes do not change from pass to pass and only the weights change?
  3. The reshape and forward methods use index 0 for the 'top' input param. Why would I need to use top[0].data=... or top[0].input=... instead of top.data=... and top.input=...? What is this index about? If we do not use the other elements of this top list, why is it exposed in this way? I suspect it is an artifact of the C++ backbone, but it would be good to know exactly.
  4. reshape() method, the line with:

     if bottom[0].count != bottom[1].count:

    What do I do here? Why is its dimension 2 again? What am I counting here? Why should both blobs (0 and 1) be equal in the number of some members (count)?

  5. forward() method: what do I define with this line:

     self.diff[...] = bottom[0].data - bottom[1].data 

    Where is it used after the forward pass, if I define it here? Can we just use

     diff = bottom[0].data - bottom[1].data 

    instead, to compute the loss later in this method without assigning it to self, or is the assignment done for some purpose?

  6. backward() method: what is this about: for i in range(2):? Why is the range 2 again?

  7. backward() method, propagate_down parameter: why is it defined? I mean, if it is True, the gradient should be assigned to bottom[X].diff as I see it, but why would someone call a method that does nothing when propagate_down = False, if it just does nothing and still cycles inside?

I'm sorry if these questions are too obvious; I just wasn't able to find a good guide to understand them, so I'm asking for help here.

You asked a lot of questions here, so I'll give you some highlights and pointers that I hope will clarify matters for you. I will not explicitly answer all your questions.

It seems like you are most confused about the difference between a blob and a layer's input/output. Indeed, most layers have a single blob as input and a single blob as output, but that is not always the case. Consider a loss layer: it has two inputs, predictions and ground-truth labels. So, in this case bottom is a vector of length 2 (!), with bottom[0] being a (4-D) blob representing predictions, while bottom[1] is another blob with the labels. Thus, when constructing such a layer you must ascertain that you have exactly (hard coded) 2 input blobs (see, e.g., ExactNumBottomBlobs() in the AccuracyLayer definition).
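In a Python layer that kind of check usually lives in setup(). A minimal sketch (the class name and error message are illustrative only):

    import caffe

    class TwoInputLayer(caffe.Layer):  # hypothetical name
        def setup(self, bottom, top):
            # bottom is a list of input blobs; this layer insists on
            # exactly two of them (e.g. predictions and labels)
            if len(bottom) != 2:
                raise Exception("Need exactly two bottom blobs.")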

The same goes for top blobs as well: indeed, in most cases there is a single top for each layer, but it is not always the case (see, e.g., AccuracyLayer). Therefore, top is also a vector of 4-D blobs, one for each top of the layer. Most of the time there is a single element in that vector, but sometimes you might find more than one.
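That is also why you always index into top and bottom even when there is only one of each: both arguments are lists of blobs, so a single output is still addressed as top[0]. A toy sketch (methods of a hypothetical pass-through caffe.Layer subclass, just to show the indexing):

    def reshape(self, bottom, top):
        # top is a list of output blobs; the single output is top[0]
        top[0].reshape(*bottom[0].data.shape)

    def forward(self, bottom, top):
        # copy the single input blob into the single output blob
        top[0].data[...] = bottom[0].data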

I believe this covers your questions 1, 3, 4 and 6.

As for reshape() (Q.2): this function is not called on every forward pass; it is called only when the net is set up, to allocate space for inputs/outputs and params. Occasionally, you might want to change the input size of your net (e.g., for detection nets); then you need to call reshape() for all layers of the net to accommodate the new input size.
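From pycaffe that typically looks something like the following (a sketch; the file names and the input blob name 'data' are assumptions about your deploy net):

    import caffe

    net = caffe.Net('deploy.prototxt', 'weights.caffemodel', caffe.TEST)

    # give the input blob a new spatial size, then propagate the new
    # shapes through the net (this calls reshape() on every layer)
    net.blobs['data'].reshape(1, 3, 600, 800)
    net.reshape()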

As for the propagate_down parameter (Q.7): since a layer may have more than one bottom, you would, in principle, need to pass the gradient to all bottoms during backprop. However, what is the meaning of a gradient with respect to the label bottom of a loss layer? There are cases when you do not want to propagate to all bottoms: that is what this flag is for. (Here's an example of a loss layer with three bottoms that expects gradients for all of them.)
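Inside a Python layer's backward() you therefore check the flag for each bottom before writing anything into its diff. A sketch (a method of a caffe.Layer subclass; the gradient itself is just a placeholder pass-through):

    def backward(self, top, propagate_down, bottom):
        # propagate_down[i] says whether bottom[i] actually needs a gradient
        for i in range(len(bottom)):
            if not propagate_down[i]:
                continue  # e.g. skip the label bottom of a loss layer
            # hypothetical gradient: simply pass the top gradient through
            bottom[i].diff[...] = top[0].diff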

For more information, see this "Python" layer tutorial.

Why should it be 2?

That specific gist is talking about the Euclidean loss layer. Euclidean loss is the mean squared error between 2 vectors. Hence this layer must have 2 input blobs. The length of each vector must be the same because the difference is element-wise. You can see this check in the reshape method.
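For reference, here is a condensed sketch along the lines of Caffe's Euclidean-loss Python layer example (the one that gist follows); it ties together the count check, self.diff and range(2):

    import numpy as np
    import caffe

    class EuclideanLossLayer(caffe.Layer):
        def setup(self, bottom, top):
            if len(bottom) != 2:
                raise Exception("Need two inputs to compute distance.")

        def reshape(self, bottom, top):
            # both blobs must hold the same number of elements,
            # because the difference below is element-wise
            if bottom[0].count != bottom[1].count:
                raise Exception("Inputs must have the same dimension.")
            # buffer for the element-wise difference, reused in backward()
            self.diff = np.zeros_like(bottom[0].data, dtype=np.float32)
            # the loss output is a scalar
            top[0].reshape(1)

        def forward(self, bottom, top):
            # stored on self so backward() can reuse it
            self.diff[...] = bottom[0].data - bottom[1].data
            top[0].data[...] = np.sum(self.diff ** 2) / bottom[0].num / 2.

        def backward(self, top, propagate_down, bottom):
            # range(2) because there are exactly two bottoms
            for i in range(2):
                if not propagate_down[i]:
                    continue
                sign = 1 if i == 0 else -1
                bottom[i].diff[...] = sign * self.diff / bottom[i].num

Keeping the difference on self rather than in a local variable is what lets backward() compute the gradient without redoing the subtraction.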

Thanks.
