
How to preprocess videos for a Conv3D Model

I have this Conv3D model in Keras:

from keras.models import Sequential
from keras.layers import Conv3D, MaxPooling3D, Flatten, Dense, Dropout

# built inside a class that provides self.input_shape and self.nb_classes
model = Sequential([
    Conv3D(32, (3,3,3), activation='relu', input_shape=self.input_shape),
    MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)),
    Conv3D(64, (3,3,3), activation='relu'),
    MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)),
    Conv3D(128, (3,3,3), activation='relu'),
    Conv3D(128, (3,3,3), activation='relu'),
    MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)),
    Conv3D(256, (2,2,2), activation='relu'),
    Conv3D(256, (2,2,2), activation='relu'),
    MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)),

    Flatten(),
    Dense(1024),
    Dropout(0.5),
    Dense(1024),
    Dropout(0.5),
    Dense(self.nb_classes, activation='softmax')
])
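Before choosing a preprocessing scheme it can help to check which clip sizes this stack accepts at all. The following is a rough sketch in plain Python, not Keras code. It assumes the Keras defaults for these layers (`padding='valid'` and stride 1 for `Conv3D`) and a hypothetical input of 16 RGB frames at 112x112. `LAYERS` and `trace_shape` are illustrative names.

```python
# Layer stack of the model above: ("conv", kernel, filters) or ("pool",).
LAYERS = [
    ("conv", 3, 32), ("pool",),
    ("conv", 3, 64), ("pool",),
    ("conv", 3, 128), ("conv", 3, 128), ("pool",),
    ("conv", 2, 256), ("conv", 2, 256), ("pool",),
]


def trace_shape(input_shape):
    """Return (frames, height, width, channels) after the last pooling layer."""
    frames, height, width, channels = input_shape
    for layer in LAYERS:
        if layer[0] == "conv":
            _, k, filters = layer
            # 'valid' padding with stride 1: every axis shrinks by kernel - 1
            frames, height, width = frames - k + 1, height - k + 1, width - k + 1
            channels = filters
        else:
            # pool_size=(1, 2, 2), strides=(1, 2, 2): frames untouched,
            # height and width halved (rounded down)
            height, width = (height - 2) // 2 + 1, (width - 2) // 2 + 1
        if min(frames, height, width) < 1:
            raise ValueError("input is too small for this layer stack")
    return frames, height, width, channels


# Assumed input: 16 frames of 112x112 RGB
print(trace_shape((16, 112, 112, 3)))
```

Under these assumptions the six convolutions remove 10 frames in total, so every clip passed to the model needs at least 11 frames, and all clips must have the same length as `input_shape`.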

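The model is based on this paper: https://arxiv.org/pdf/1412.0767.pdf

What is the best way to preprocess the video data for prediction with Conv3D?

I wrote this function to extract the frames from each video in UCF-101: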

import glob
import os

import cv2


def frame_writer(pathIn, pathOut, class_name):
    """
    Read every video of one class and write its frames to a new dataset.
    args:
        pathIn -> base dataset of videos
        pathOut -> destination folder for the frames ('data/path')
        class_name -> name of the class subfolder to process
    """
    # create the output folder if it does not exist yet
    out_dir = os.path.join(pathOut, class_name)
    try:
        os.makedirs(out_dir, exist_ok=True)
    except OSError:
        print('Invalid path!')
        return

    # list all .avi files of this class (os.path.join keeps this portable)
    pathIn_files = glob.glob(os.path.join(pathIn, class_name, '*.avi'))

    # iterate over all files
    for video_path in pathIn_files:
        # file name without extension
        file_name = os.path.splitext(os.path.basename(video_path))[0]

        # read frames until the video ends; only write frames that were
        # actually read, starting with the first one
        vidcap = cv2.VideoCapture(video_path)
        count = 0
        success, image = vidcap.read()
        while success:
            frame_name = '%s_frame%d.jpg' % (file_name, count)
            cv2.imwrite(os.path.join(out_dir, frame_name), image)
            count += 1
            success, image = vidcap.read()
        vidcap.release()
    print('Done!')

Now I have a dataset of frames like this:

FOLDER: data

- SUBFOLDER: train

-- SUBFOLDER: class1

--- frame1_video1_class1.jpg
--- frame2_video1_class1.jpg
--- frame3_video1_class1.jpg
...
--- frameN_videoN_class1.jpg

-- SUBFOLDER: class2

--- frame1_video1_class2.jpg
--- frame2_video1_class2.jpg
--- frame3_video1_class2.jpg
...
--- frameN_videoN_class2.jpg

- SUBFOLDER: test

-- SUBFOLDER: class1

--- frame1_video1_class1.jpg
--- frame2_video1_class1.jpg
--- frame3_video1_class1.jpg
...
--- frameN_videoN_class1.jpg

-- SUBFOLDER: class2

--- frame1_video1_class2.jpg
--- frame2_video1_class2.jpg
--- frame3_video1_class2.jpg
...
--- frameN_videoN_class2.jpg

So all the frames from all the videos are in the folder that corresponds to their class.
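Since the frames of different videos share one class folder, they would need to be regrouped per video and put back in temporal order before they can form clips. A minimal sketch, assuming the `<video_name>_frame<index>.jpg` naming that `frame_writer` produces. `FRAME_RE` and `group_frames_by_video` are hypothetical helper names.

```python
import re
from collections import defaultdict

# Assumed naming: "<video_name>_frame<index>.jpg", as written by frame_writer.
FRAME_RE = re.compile(r"^(?P<video>.+)_frame(?P<index>\d+)\.jpg$")


def group_frames_by_video(file_names):
    """Map each video name to its frame file names, sorted by frame index."""
    indexed = defaultdict(list)
    for name in file_names:
        match = FRAME_RE.match(name)
        if match is None:
            continue  # skip files that do not follow the assumed naming
        indexed[match.group("video")].append((int(match.group("index")), name))
    # sort numerically so that frame2 comes before frame10
    return {
        video: [name for _, name in sorted(frames)]
        for video, frames in indexed.items()
    }
```

The numeric sort matters here: a plain alphabetical sort of the file names would place `frame10` before `frame2` and scramble the time axis.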

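Do I have to use ImageDataGenerator from Keras to pass them to the Conv3D model?

If so, does it pass every frame of every video of every class one at a time?

Or do I have to do this some other way?

I only need to use this model to predict on videos!

Thanks for your support!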

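One way is to put all the frames into one big tensor, label them accordingly, and then use that as the input to your Keras model. The number of frames in the tensor would be your batch size.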
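Note that Keras `Conv3D` layers take 5D input, shaped (batch, frames, height, width, channels) in the default channels_last format. One possible reading of the suggestion is therefore to cut each video's frames into fixed-length clips, so that the batch axis counts clips and the frames of each clip form the second axis. A sketch with NumPy, under these assumptions: the frames of one video are already loaded, resized to a common size, and in temporal order; the clip length of 16 and the scaling to [0, 1] are illustrative choices; `frames_to_clips` is a hypothetical helper.

```python
import numpy as np


def frames_to_clips(frames, label, clip_len=16):
    """Split one video's frames into non-overlapping clips for a Conv3D model.

    frames: array-like of shape (num_frames, height, width, channels), uint8
    label:  integer class index of the video
    Returns (clips, labels) with clips shaped
    (num_clips, clip_len, height, width, channels).
    """
    frames = np.asarray(frames, dtype=np.float32) / 255.0  # scale to [0, 1]
    num_clips = frames.shape[0] // clip_len
    if num_clips == 0:
        raise ValueError("video has fewer frames than clip_len")
    # drop trailing frames that do not fill a whole clip
    usable = frames[: num_clips * clip_len]
    clips = usable.reshape((num_clips, clip_len) + frames.shape[1:])
    # every clip inherits the label of its video
    labels = np.full(num_clips, label, dtype=np.int64)
    return clips, labels


# Example with dummy data: 35 frames of 112x112 RGB
dummy = np.zeros((35, 112, 112, 3), dtype=np.uint8)
clips, labels = frames_to_clips(dummy, label=0)
print(clips.shape, labels.shape)
```

With clips built this way, `input_shape` for the first layer would be `(clip_len, height, width, channels)`, and `model.predict(clips)` would return one row of class probabilities per clip.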



 