
Creating a CoreML LRCN model

Hello, and thank you in advance for any help or guidance provided!

The question I have stems from an article posted on Apple's CoreML documentation site. The topic of this article was also covered during the WWDC 2017 lectures, and I found it quite interesting. I recently posted a question related to part of this same project, and it was solved with ease; however, as I get further into this endeavor, I find myself not understanding how part of this model is being implemented.

To start off, I have a model I'm building in Keras with a TensorFlow backend that uses convolutional layers inside the TimeDistributed wrapper. Following the convolutional section, a single LSTM layer connects to a dense layer as the output. The goal is to create a many-to-many structure that classifies each item in a padded sequence of images. I'll post the code for the model below.

My plan to train and deploy this network may raise other questions down the road, but I will make a separate post if they cause trouble. The plan is to train with the TimeDistributed wrapper, then strip it off the model and load the weights for the wrapped layers into an unwrapped copy at CoreML conversion time, since the TimeDistributed wrapper doesn't play well with CoreML.
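For illustration, here is a minimal sketch of that weight-transfer idea (this is my assumed approach, not code from the article; the layer names and shapes are placeholders):

from keras.layers import Input, Conv2D, TimeDistributed
from keras.models import Model

# Training-time layer, wrapped so it runs on every frame of a sequence
td_conv = TimeDistributed(Conv2D(64, (3, 3), activation='relu'), name='td_conv')
seq_in = Input(shape=(10, 224, 224, 3))  # placeholder sequence length
train_model = Model(seq_in, td_conv(seq_in))

# Inference-time layer, unwrapped to take a single image
frame_in = Input(shape=(224, 224, 3))
conv = Conv2D(64, (3, 3), activation='relu')
infer_model = Model(frame_in, conv(frame_in))

# TimeDistributed delegates its weights to the wrapped layer, so they
# transfer directly to the unwrapped copy before conversion
conv.set_weights(train_model.get_layer('td_conv').get_weights())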

My question is this:

In the aforementioned article (and in a CoreML example project I found on GitHub), the implementation is quite clever. Since CoreML (or at least the stock converter) doesn't support image sequences as inputs, the images are fed in one at a time, and the LSTM states are passed out of the network as an output, along with the prediction for the input image. For the next image in the sequence, the user passes the image along with the previous time step's LSTM state, so the model can "pick up where it left off," so to speak, and handle the single inputs as a sequence. This forms a loop for the LSTM state (covered in further detail in the Apple article). Now, for the actual question part...
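To make the pattern concrete, here is a rough sketch of that per-frame loop driven from Python via coremltools (the model file name and the input/output names such as 'image', 'lstm_h_in', and 'classProbs' are assumptions; the real names come from the converted model's spec):

import coremltools
import numpy as np

mlmodel = coremltools.models.MLModel('LRCN.mlmodel')  # assumed file name

# Placeholder frames; a real app would feed camera images, one per call
frames = [np.zeros((3, 224, 224)) for _ in range(4)]

h = np.zeros((256,))  # LSTM hidden state, zeroed at the start of a sequence
c = np.zeros((256,))  # LSTM cell state, zeroed at the start of a sequence
for frame in frames:
    out = mlmodel.predict({'image': frame, 'lstm_h_in': h, 'lstm_c_in': c})
    # Loop the states back in so the next call continues the sequence
    h, c = out['lstm_h_out'], out['lstm_c_out']
    prediction = out['classProbs']  # assumed prediction output name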

How is this implemented in a library like Keras? So far I have been successful at outputting the LSTM state using the functional API and the "return_state" setting on the LSTM layer, and routing it to a secondary output. Pretty simple. Not so simple (at least for me) is how to pass that state back INTO the network for the next prediction. I've looked over the source code and documentation for the LSTM layer, and I don't see anything that jumps out as an input for the state. The only thing I can think of is to possibly make the LSTM layer its own model and use the "initial_state" argument to set it (I sketch that idea after the model code below), but based on a post I found on the Keras GitHub, it seems the model would then need a custom call function, and I'm not sure how to work that into CoreML. Just FYI, I am planning to loop both the hidden and cell states in and out of the model, unless that isn't necessary and only the hidden states should be used, as is shown in Apple's model.

Thanks once again. Any help provided is always appreciated!

My current model looks like this:

# Imports assumed for this snippet; max_sequence_length and num_classes are
# placeholders for values defined elsewhere in my project.
from keras.layers import (Input, Conv2D, MaxPooling2D, TimeDistributed,
                          LSTM, Dense, Flatten, Dropout)
from keras.models import Model

max_sequence_length = 10  # placeholder: length of the padded image sequence
num_classes = 5           # placeholder: number of target classes

image_input = Input(shape=(max_sequence_length, 224, 224, 3))
hidden_state_input = Input(shape=(256,))  # LSTM hidden state (h), one sample
cell_state_input = Input(shape=(256,))    # LSTM cell state (c), one sample

convolutional_1 = TimeDistributed(Conv2D(64, (3, 3), activation='relu', data_format='channels_last'))(image_input)
pooling_1 = TimeDistributed(MaxPooling2D((2, 2), strides=(1, 1)))(convolutional_1)

convolutional_2 = TimeDistributed(Conv2D(128, (4, 4), activation='relu'))(pooling_1)
pooling_2 = TimeDistributed(MaxPooling2D((2, 2), strides=(2, 2)))(convolutional_2)

convolutional_3 = TimeDistributed(Conv2D(256, (4, 4), activation='relu'))(pooling_2)
pooling_3 = TimeDistributed(MaxPooling2D((2, 2), strides=(2, 2)))(convolutional_3)

flatten_1 = TimeDistributed(Flatten())(pooling_3)
dropout_1 = TimeDistributed(Dropout(0.5))(flatten_1)

# Note: the state inputs above are not yet wired into the LSTM; that wiring
# is exactly what I'm asking about.
lstm_1, state_h, state_c = LSTM(256, return_sequences=True, return_state=True, stateful=False, dropout=0.5)(dropout_1)

dense_1 = TimeDistributed(Dense(num_classes, activation='sigmoid'))(lstm_1)

model = Model(inputs=[image_input, hidden_state_input, cell_state_input], outputs=[dense_1, state_h, state_c])
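For reference, here is a minimal sketch of the "initial_state" idea mentioned above (my own untested guess, not verified against the CoreML converter): the Keras functional API lets you pass initial states directly in the LSTM layer's call, which would connect the state inputs declared earlier.

# Sketch only: feed the declared state inputs into the LSTM's call
lstm_1, state_h, state_c = LSTM(256, return_sequences=True, return_state=True,
                                dropout=0.5)(dropout_1,
                                             initial_state=[hidden_state_input,
                                                            cell_state_input])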

Link to the Apple article: https://developer.apple.com/documentation/coreml/core_ml_api/making_predictions_with_a_sequence_of_inputs

Link to a GitHub repo with an example model that uses a similar method: https://github.com/akimach/GestureAI-CoreML-iOS

Link to the Keras GitHub post about the custom call function: https://github.com/keras-team/keras/issues/2995

It turns out the coremltools converter will automatically add the state inputs and outputs during conversion.

See the Keras converter's _topology.py, line 215, for reference.
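To illustrate, a rough sketch of what the conversion might look like (assuming the legacy Keras converter in coremltools; the input/output names and file name are placeholders):

import coremltools

# Convert the Keras model; the converter adds the recurrent state
# inputs/outputs on its own, so they don't need to be declared in Keras
coreml_model = coremltools.converters.keras.convert(
    model,
    input_names=['image'],
    output_names=['classProbs'],
)
coreml_model.save('LRCN.mlmodel')
# Inspecting the converted model's spec should show the extra state
# inputs and matching outputs added for the LSTM layer.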
