
What kind of data is stored in a pre-trained model, such as those in the Caffe Model Zoo?

Asked 2016-11-29 20:33:39 · Tags: machine-learning, computer-vision

I came across this question while reading the SqueezeNet paper. The authors state that they use Deep Compression to compress the pre-trained model. The algorithm includes Huffman coding, among other steps.

I infer that a pre-trained model consists entirely of parameters. I know these parameters are produced while training the network, but I have no idea how they are produced. What role do the parameters of the pre-trained model play during prediction?

It sounds like black magic to me.

1 Answer

The pre-trained model consists of the learned weights for every layer: the values of every kernel and of every connection into and out of it. That's the "heavy lifting" from the first 40-80 epochs of training. The model should be ready to make predictions, or to continue training with whatever fine-tuning you'd care to apply.
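To make the role of the parameters during prediction concrete, here is a minimal sketch in plain Python. It is not Caffe code: the layer names (`fc1`, `fc2`), the `params` dictionary layout, and the weight values are all made up for illustration. The point it shows is that prediction is ordinary arithmetic on the input using the stored numbers, with no training involved.

```python
# Hypothetical "pre-trained" parameters. In a real model these values come
# out of training; here they are invented purely for illustration.
params = {
    "fc1": {"weights": [[1.0, -1.0], [0.5, 0.5]], "bias": [0.0, -0.25]},
    "fc2": {"weights": [[1.0, 0.0], [-1.0, 2.0]], "bias": [0.1, 0.0]},
}


def dense(x, layer):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(layer["weights"], layer["bias"])
    ]


def relu(values):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in values]


def predict(x, params):
    """Forward pass: apply the stored weights layer by layer, pick the top class."""
    hidden = relu(dense(x, params["fc1"]))
    scores = dense(hidden, params["fc2"])
    return scores.index(max(scores))


print(predict([3.0, 0.0], params))  # class index for one input
print(predict([2.0, 1.0], params))  # class index for another input
```

Fine-tuning would start from these same numbers and keep adjusting them on new data instead of starting from random values.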

It's not really black magic. Each framework has a facility to dump (back up) the parameter values at specified intervals and at completion of training. Granted, these are relatively large files, hence the use of compression. Each framework also has a facility to read in such a dump file in order to bootstrap a model.
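The dump-and-reload cycle described above can be sketched with the Python standard library alone. This is an assumption-laden illustration, not any framework's actual format: real frameworks use their own binary formats, and the function names `save_checkpoint` and `load_checkpoint`, the file name, and the JSON layout are hypothetical. The gzip step is generic compression standing in for the idea; it is not the Deep Compression pipeline from the paper.

```python
import gzip
import json
import os
import tempfile


def save_checkpoint(params, path):
    """Dump the parameter values to a gzip-compressed JSON file."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        json.dump(params, f)


def load_checkpoint(path):
    """Read a dump file back into the same nested structure."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)


# Hypothetical parameters for a single layer (all zeros, for illustration only).
checkpoint_params = {
    "conv1": {
        "weights": [[0.0] * 64 for _ in range(64)],
        "bias": [0.0] * 64,
    }
}

checkpoint_path = os.path.join(tempfile.mkdtemp(), "snapshot.json.gz")
save_checkpoint(checkpoint_params, checkpoint_path)
restored = load_checkpoint(checkpoint_path)

# The restored values match the originals, so a model can be bootstrapped from them.
print(restored == checkpoint_params)
print(os.path.getsize(checkpoint_path), "bytes on disk")
```

A model bootstrapped from `restored` would behave the same as the one that was saved, which is all a "pre-trained model" download amounts to: someone else's dump file.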