
Keras: convert pretrained weights between Theano and TensorFlow

I would like to use this pretrained model.

It is in Theano layout, but my code depends on TensorFlow image dimension ordering.
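To make the two layouts concrete, here is a minimal NumPy sketch. It assumes that "Theano layout" means channels-first image tensors and "TensorFlow ordering" means channels-last; the batch shape is hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical batch of 2 RGB images, 4x5 pixels, channels-first ("th" ordering):
# axes are (batch, channels, rows, cols)
batch_th = np.zeros((2, 3, 4, 5), dtype=np.float32)

# Move the channel axis to the end to get channels-last ("tf" ordering):
# axes become (batch, rows, cols, channels)
batch_tf = batch_th.transpose((0, 2, 3, 1))

print(batch_tf.shape)
```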

There is a guide on converting weights between the formats.

But this seems broken. In the section on converting Theano to TensorFlow, the first instruction is to load the weights into the TensorFlow model.

Keras backend should be TensorFlow in this case. First, load the Theano-trained weights into your TensorFlow model:

model.load_weights('my_weights_theano.h5')

This raises an exception, since the weight layouts are incompatible. And if the load_weights function accepted Theano weights for a TensorFlow model, there would be no need to convert them.

I took a look at the convert_kernel function to see whether I could do the necessary steps myself.

The code is rather simple - I don't understand why the guide makes use of a TensorFlow session. That seems unnecessary.
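As a rough illustration of why no session should be needed, here is a minimal NumPy sketch of the kind of spatial flip a kernel conversion performs. This is an assumption about what the conversion boils down to, not a copy of the Keras source; flip_spatial_axes is a hypothetical helper, and the kernel shape assumes the Theano-ordering layout (out_channels, in_channels, rows, cols).

```python
import numpy as np

def flip_spatial_axes(kernel, spatial_axes):
    """Reverse the kernel along each spatial axis (a 180-degree rotation in 2D)."""
    index = [slice(None)] * kernel.ndim
    for axis in spatial_axes:
        index[axis] = slice(None, None, -1)
    return kernel[tuple(index)]

# Hypothetical Theano-layout kernel: (out_channels, in_channels, rows, cols)
kernel_th = np.arange(2 * 3 * 2 * 2, dtype=np.float32).reshape(2, 3, 2, 2)

# Flip rows and cols only; the channel axes are left untouched
flipped = flip_spatial_axes(kernel_th, spatial_axes=(2, 3))
```

Flipping twice gives back the original kernel, so the same operation would serve in both directions.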

I've copied the code from the pretrained model to create a model with TensorFlow layout. This just meant changing the input shape and the backend.image_dim_ordering before adding any convolutions. Then I used this loop:

model is the original model, created from the code I linked at the beginning. model_tensorflow is the exact same model, but with TensorFlow layout.

for i in range(len(model.layers)):
    layer_theano=model.layers[i]
    layer_tensorflow=model_tensorflow.layers[i]

    if layer_theano.__class__.__name__ in ['Convolution1D', 'Convolution2D', 'Convolution3D', 'AtrousConvolution2D']:
        weights_theano=layer_theano.get_weights()

        kernel=weights_theano[0]
        bias=weights_theano[1]

        converted_kernel=convert_kernel(kernel, "th")
        converted_kernel=converted_kernel.transpose((3,2,1,0))

        weights_tensorflow=[converted_kernel, bias]

        layer_tensorflow.set_weights(weights_tensorflow)

    else:
        layer_tensorflow.set_weights(layer_theano.get_weights())

In the original code, there is a test case: a prediction run on an image of a cat. I've downloaded a cat image and tried the test case with the original model, which predicts 285. The converted model predicts 585.

I don't know whether 285 is the correct label for a cat, but even if it isn't, the two models should be broken in the same way, so I would expect the same prediction.

What is the correct way of converting weights between models?

You are right. The code is broken. As of now, there is a workaround for this issue, and the solution is described here.

I have tested it myself and it worked for me.
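The linked workaround is not reproduced here. Independently of it, one thing worth checking in the question's loop is the axis permutation. Assuming the Keras 1.x kernel layouts of (out_channels, in_channels, rows, cols) for Theano ordering and (rows, cols, in_channels, out_channels) for TensorFlow ordering, transpose((3,2,1,0)) puts cols before rows, whereas transpose((2,3,1,0)) keeps rows first. The sketch below uses a hypothetical random kernel to show that the two permutations give the same shape for square kernels but different values. Whether this accounts for the 285 vs 585 mismatch is not confirmed; the spatial flip and any Dense layer that follows a Flatten may also need attention.

```python
import numpy as np

# Hypothetical Theano-layout kernel: (out_channels, in_channels, rows, cols)
kernel_th = np.random.RandomState(0).rand(4, 3, 5, 5).astype(np.float32)

# Permutation used in the question's loop: result axes are (cols, rows, in, out)
as_in_question = kernel_th.transpose((3, 2, 1, 0))

# Permutation that keeps rows before cols: result axes are (rows, cols, in, out)
rows_first = kernel_th.transpose((2, 3, 1, 0))

# For square kernels the shapes match, so set_weights would accept either one
same_shape = as_in_question.shape == rows_first.shape

# The values differ: each filter is spatially transposed relative to the other
same_values = np.array_equal(as_in_question, rows_first)
spatially_transposed = np.array_equal(
    as_in_question, rows_first.transpose((1, 0, 2, 3))
)

print(same_shape, same_values, spatially_transposed)
```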

If you feel the answer is useful, please upvote it. Thanks!

