
multiprocessing of video frames in python

I am new to multiprocessing in Python. I want to extract features from each frame of hour-long video files. Processing each frame takes on the order of 30 ms. I thought multiprocessing was a good idea because each frame is processed independently of all other frames.

I want to store the results of the feature extraction in a custom class.

I read a few examples and ended up using multiprocessing and Queues as suggested here. The result was disappointing though: now each frame takes about 1000 ms to process. I am guessing I produced a ton of overhead.

Is there a more efficient way to process the frames in parallel and collect the results?

To illustrate, I put together a dummy example.

import multiprocessing as mp
from multiprocessing import Process, Queue
import numpy as np
import cv2

def main():
    #path='path\to\some\video.avi'
    coordinates=np.random.random((1000,2))
    #video = cv2.VideoCapture(path)
    listOf_FuncAndArgLists=[]

    for i in range(50):
        #video.set(cv2.CAP_PROP_POS_FRAMES,i)
        #img_frame_original = video.read()[1]
        #img_frame_original=cv2.cvtColor(img_frame_original, cv2.COLOR_BGR2GRAY)
        img_frame_dummy=np.random.random((300,300)) #using dummy image for this example
        frame_coordinates=coordinates[i,:]
        listOf_FuncAndArgLists.append([parallel_function,frame_coordinates,i,img_frame_dummy])

    queues=[Queue() for fff in listOf_FuncAndArgLists] #create a queue object for each function
    jobs = [Process(target=storeOutputFFF,args=[funcArgs[0],funcArgs[1:],queues[iii]]) for iii,funcArgs in enumerate(listOf_FuncAndArgLists)]
    for job in jobs: job.start() # Launch them all
    for job in jobs: job.join() # Wait for them all to finish
    # And now, collect all the outputs:
    return([queue.get() for queue in queues])         

def storeOutputFFF(fff,theArgs,que): #add a queue argument so the return value can be collected
    print('MULTIPROCESSING: Launching %s in parallel ' % fff.__name__)
    que.put(fff(*theArgs)) #we're putting the return value into the queue

def parallel_function(frame_coordinates,i,img_frame_original):
    #do some image processing that takes about 20-30 ms
    dummyResult=np.argmax(img_frame_original)
    return(resultClass(dummyResult,i))

class resultClass(object):
    def __init__(self,maxIntensity,i):
        self.maxIntensity=maxIntensity
        self.i=i

if __name__ == '__main__':
    mp.freeze_support()
    a=main()
    [x.maxIntensity for x in a]

Parallel processing in (regular) Python is a bit of a pain: in other languages we'd just use threads, but the GIL makes that problematic, and using multiprocessing has a big overhead in moving data around. I've found that fine-grained parallelism is (relatively) hard to do, whilst processing 'chunks' of work that take tens of seconds (or more) in a single process can be much more straightforward.

An easier path to parallelizing your problem - if you're on a UNIXy system - would be to make a Python program which processes a segment of video specified on the command line (i.e. a frame number to start with, and a number of frames to process), and then use the GNU parallel tool to process multiple segments at once. A second Python program can consolidate the results from a collection of files, or read from stdin, piped from parallel. This approach means that the processing code doesn't need to do its own parallelism, but it does require the input file to be accessed multiple times and frames to be extracted starting from mid-points. (This might also be extendable to work across multiple machines without changing the Python...)
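
For illustration, here is a minimal sketch of what such a segment-processing script might look like; the script name extract_segment.py, the pickle-based per-segment output files, and the GNU parallel invocation shown in the comment are assumptions made for this example, not a tested pipeline.

# extract_segment.py - hypothetical worker script for use with GNU parallel
# Usage sketch (segments of 100 frames are arbitrary):
#   seq 0 100 900 | parallel python extract_segment.py video.avi {} 100
# Each invocation processes num_frames frames starting at start_frame and
# writes its results to a per-segment file that a second script can merge.
import pickle
import sys

import cv2
import numpy as np

def process_frame(img):
    return np.argmax(img)  # placeholder for the real ~30 ms feature extraction

def main():
    path = sys.argv[1]
    start_frame = int(sys.argv[2])
    num_frames = int(sys.argv[3])

    video = cv2.VideoCapture(path)
    video.set(cv2.CAP_PROP_POS_FRAMES, start_frame)  # seek to the segment's first frame

    results = []
    for i in range(start_frame, start_frame + num_frames):
        ok, frame = video.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        results.append((i, process_frame(gray)))

    with open('results_%d.pkl' % start_frame, 'wb') as f:
        pickle.dump(results, f)  # one output file per segment

if __name__ == '__main__':
    main()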

multiprocessing.Pool.map could be used in a similar way if you need a pure-Python solution: map over a list of tuples (say, (file, startframe, endframe)) and then open the file in the function and process that segment.
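
A minimal sketch of that pure-Python approach, assuming the total frame count is known and using a dummy np.argmax in place of the real feature extraction; the path and segment sizes are placeholders.

import multiprocessing as mp

import cv2
import numpy as np

def process_segment(args):
    # each worker opens the video itself and handles one (start, end) frame range
    path, start_frame, end_frame = args
    video = cv2.VideoCapture(path)
    video.set(cv2.CAP_PROP_POS_FRAMES, start_frame)
    results = []
    for i in range(start_frame, end_frame):
        ok, frame = video.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        results.append((i, np.argmax(gray)))  # stand-in for the real feature extraction
    return results

if __name__ == '__main__':
    path = 'path/to/some/video.avi'      # placeholder
    total_frames = 1000                  # assumed known; cv2.CAP_PROP_FRAME_COUNT could supply it
    n_workers = mp.cpu_count()
    step = -(-total_frames // n_workers) # ceiling division so every frame is covered
    segments = [(path, s, min(s + step, total_frames))
                for s in range(0, total_frames, step)]
    with mp.Pool(n_workers) as pool:
        per_segment = pool.map(process_segment, segments)
    all_results = [r for seg in per_segment for r in seg]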

Multiprocessing creates some overhead for starting several processes and bringing them all back together.

Your code does that for every frame.

Try splitting your video into N evenly-sized pieces and processing them in parallel.

Set N equal to the number of cores on your machine or something like that (your mileage may vary, but it's a good number to start experimenting with). There's no point in creating 50 processes if, say, 4 of them are getting executed and the rest are simply waiting for their turn.
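
A rough sketch of that advice applied to the dummy example from the question: a Pool of N workers (N = number of cores) is reused across all 50 frames instead of spawning one Process per frame. The chunksize value is just a starting point to experiment with; note that the frame arrays are still pickled over to the workers, which is part of the overhead mentioned above.

import multiprocessing as mp

import numpy as np

def parallel_function(args):
    i, img = args
    return (i, np.argmax(img))  # stand-in for the ~30 ms per-frame work

if __name__ == '__main__':
    # dummy frames, as in the question
    frames = [(i, np.random.random((300, 300))) for i in range(50)]
    n_workers = mp.cpu_count()  # N = number of cores is a reasonable starting point
    with mp.Pool(n_workers) as pool:
        # chunksize batches several frames per task so workers are not fed one frame at a time
        results = pool.map(parallel_function, frames,
                           chunksize=max(1, len(frames) // n_workers))
    print(results[:3])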
