How to classify videos of different lengths using a CNN-LSTM?

Question

I'm trying to classify videos into 10 categories. I have this model so far:

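from tensorflow import keras  # assumed import; the original snippet omits it

model = keras.models.Sequential()
# Apply the same Conv2D to every frame; None leaves the number of frames unspecified.
model.add(keras.layers.TimeDistributed(
    keras.layers.Conv2D(filters=3, kernel_size=(5, 5), activation='relu'),
    input_shape=(None, 90, 90, 3)))
# Flatten each frame's feature map into a vector.
model.add(keras.layers.TimeDistributed(keras.layers.Flatten()))
# The LSTM consumes the sequence of per-frame vectors.
model.add(keras.layers.LSTM(20, activation='relu'))
# One softmax output per category.
model.add(keras.layers.Dense(10, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adam(),
              metrics=['accuracy'])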

The problem is that each video has a different number of frames, where each frame is 90x90 with 3 channels. I have observed that the first entry in input_shape represents the number of frames in each video, but the number of frames differs from video to video. How can I train this model on these videos? I have the videos loaded in the following format: [[image1ofvideo1,image2ofvideo1],[image1ofvideo2,image2ofvideo2,image3ofvideo2],[image1ofvideo3]]. If I try to train like this, I get an error. Also, numpy arrays don't support rows of varying length. I would also like to avoid adding black frames to make the videos equal in length.

Answer 1

You can use a data generator to feed frames dynamically, creating the data on the fly instead of saving it as a numpy array beforehand. This is a great article on how you can implement a custom data generator.
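As a rough illustration of the generator idea, here is a minimal sketch that yields one video per batch, so no padding with black frames is needed. The function name video_batch_generator, its parameters, and the synthetic data are all hypothetical; it assumes each video is already an array of shape (num_frames, 90, 90, 3) and that labels are integer class ids.

```python
import numpy as np


def video_batch_generator(videos, labels, num_classes=10, shuffle=True, seed=0):
    """Yield (frames, one_hot_label) pairs forever, one video per batch.

    videos: list of arrays, each of shape (num_frames, 90, 90, 3);
            num_frames may differ from video to video.
    labels: list of integer class ids, one per video (an assumption).
    """
    rng = np.random.default_rng(seed)
    order = np.arange(len(videos))
    while True:  # loop indefinitely; the caller decides how many steps to draw
        if shuffle:
            rng.shuffle(order)
        for i in order:
            # Add a batch axis of size 1: (1, num_frames, 90, 90, 3).
            x = np.asarray(videos[i], dtype=np.float32)[np.newaxis, ...]
            # One-hot target for categorical_crossentropy: (1, num_classes).
            y = np.zeros((1, num_classes), dtype=np.float32)
            y[0, labels[i]] = 1.0
            yield x, y


# Tiny synthetic example: three "videos" with 2, 3 and 1 frames.
videos = [np.random.rand(n, 90, 90, 3) for n in (2, 3, 1)]
labels = [0, 4, 9]
gen = video_batch_generator(videos, labels, shuffle=False)
x, y = next(gen)
print(x.shape, y.shape)  # (1, 2, 90, 90, 3) (1, 10)
```

Because every batch holds a single video, each batch is a regular array even though the frame count changes from batch to batch, which input_shape=(None,90,90,3) allows. With the model from the question, something like model.fit(gen, steps_per_epoch=len(videos), epochs=5) should work in recent TensorFlow versions (older Keras releases used fit_generator instead); the epoch count here is arbitrary.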

Answer 1
0 2019-12-08 17:53:56
