
deep learning - a number of naive questions about caffe

I am trying to understand the basics of caffe, in particular how to use it with python.

My understanding is that the model definition (say a given neural net architecture) must be included in the '.prototxt' file.

And that when you train the model on data using the '.prototxt', you save the weights/model parameters to a '.caffemodel' file.

Also, there is a difference between the '.prototxt' file used for training (which includes learning rate and regularization parameters) and the one used for testing/deployment, which does not include them.

Questions:

  1. Is it correct that the '.prototxt' is the basis for training, and that the '.caffemodel' is the result of training (the weights), obtained by using the '.prototxt' on the training data?
  2. Is it correct that there is one '.prototxt' for training and one for testing, and that there are only slight differences between them (learning rate and regularization factors for training), but that the nn architecture (assuming you use neural nets) is the same?

Apologies for such basic questions and possibly some very incorrect assumptions; I am doing some online research and the lines above summarize my understanding to date.

Let's take a look at one of the examples provided with BVLC/caffe: bvlc_reference_caffenet.
You'll notice that there are in fact 3 '.prototxt' files:

  • train_val.prototxt
  • solver.prototxt
  • deploy.prototxt

The net architecture represented by train_val.prototxt and deploy.prototxt should be mostly similar. There are a few main differences between the two (a short pycaffe sketch after this list illustrates the comparison):

  • Input data: during training one usually uses a predefined set of inputs for training/validation. Therefore, train_val usually contains an explicit input layer, e.g., an "HDF5Data" layer or a "Data" layer. On the other hand, deploy usually does not know in advance what inputs it will get; it only contains the statement:

     input: "data" input_shape { dim: 10 dim: 3 dim: 227 dim: 227 } 

    that declares what input the net expects and what its dimensions should be.
    Alternatively, one can put an "Input" layer:

     layer { name: "input" type: "Input" top: "data" input_param { shape { dim: 10 dim: 3 dim: 227 dim: 227 } } } 
  • Input labels: during training we supply the net with the "ground truth" expected outputs; this information is obviously not available during deploy.
  • Loss layers: during training one must define a loss layer. This layer tells the solver in what direction it should tune the parameters at each iteration. The loss compares the net's current prediction to the expected "ground truth". The gradient of the loss is back-propagated to the rest of the net, and this is what drives the learning process. During deploy there is no loss and no back-propagation.
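To make the comparison concrete, here is a minimal pycaffe sketch, assuming a standard BVLC/caffe checkout with the bvlc_reference_caffenet example data (LMDB and mean file) already prepared; the paths are illustrative. It loads both definitions and shows that the learnable layers match, while the label/loss blobs only exist on the training side:

    import caffe

    caffe.set_mode_cpu()

    # Illustrative paths into a BVLC/caffe checkout; adjust to your setup.
    train_net = caffe.Net('models/bvlc_reference_caffenet/train_val.prototxt', caffe.TRAIN)
    deploy_net = caffe.Net('models/bvlc_reference_caffenet/deploy.prototxt', caffe.TEST)

    # Layers with learnable parameters (conv*, fc*) appear in both definitions ...
    print(sorted(train_net.params.keys()))
    print(sorted(deploy_net.params.keys()))

    # ... while blobs such as 'label' and 'loss' exist only in the training net.
    print([b for b in train_net.blobs if b not in deploy_net.blobs])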

In caffe, you supply a train_val.prototxt describing the net, the train/val datasets and the loss. In addition, you supply a solver.prototxt describing the meta parameters for training. The output of the training process is a .caffemodel binary file containing the trained parameters of the net.
Once the net is trained, you can use the deploy.prototxt together with the .caffemodel parameters to predict outputs for new and unseen inputs.
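As a rough illustration of that deployment step, here is a minimal pycaffe sketch; the file names and the 'prob' output blob follow the bvlc_reference_caffenet example and are assumptions about your setup:

    import numpy as np
    import caffe

    caffe.set_mode_cpu()

    # Architecture from deploy.prototxt + learned weights from the .caffemodel.
    net = caffe.Net('models/bvlc_reference_caffenet/deploy.prototxt',
                    'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel',
                    caffe.TEST)

    # Feed one already-preprocessed 3x227x227 image (random data as a stand-in).
    net.blobs['data'].reshape(1, 3, 227, 227)
    net.blobs['data'].data[...] = np.random.rand(1, 3, 227, 227)

    out = net.forward()
    print(out['prob'].argmax())  # index of the most likely class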

Yes, but there are different types of .prototxt files, for example:

https://github.com/BVLC/caffe/blob/master/examples/mnist/lenet_train_test.prototxt

This one is for the training and testing network.

For command-line training you can use a solver file, which is also a .prototxt file, for example:

https://github.com/BVLC/caffe/blob/master/examples/mnist/lenet_solver.prototxt
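The same solver file can also drive training from Python instead of the command line (caffe train -solver examples/mnist/lenet_solver.prototxt). A minimal sketch, assuming you run it from the root of a caffe checkout with the MNIST LMDBs already created:

    import caffe

    caffe.set_mode_cpu()

    # The solver file points at lenet_train_test.prototxt and defines the
    # learning rate, momentum, max_iter, snapshot settings, etc.
    solver = caffe.get_solver('examples/mnist/lenet_solver.prototxt')

    # Run the whole schedule (writes .caffemodel/.solverstate snapshots) ...
    # solver.solve()
    # ... or step through it manually, e.g. 100 iterations:
    solver.step(100)

    # The nets the solver instantiated from lenet_train_test.prototxt:
    train_net = solver.net           # TRAIN phase
    test_net = solver.test_nets[0]   # TEST phase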
