简体   繁体   中英

Object Tracking in h.264 compressed video

I am working on a project that requires me to detect and track a human in a live video from a webcam connected to a Beagleboard xm. I have completed this task using Opencv in pixel domain. The results on the board are very accurate but extremely slow. Many people have suggested me to leave pixel domain and do the same task in an h.264/MPEG-4 compressed video as it would extremely reduce the computational overhead. I have read many research papers but failed to discover any software platform or a library that I can use to analyze and process h.264 compressed videos. I will be thankful if someone can suggest me some library for h.264 compressed video analysis and guide me further.

Thanks and Regards.

I'm not sure how practical this really is (I've never tried to do it), but my guess would be that what they're referring to would be looking for a block of macro-blocks that all have (nearly) identical motion vectors.

For example, let's assume you have a camera that's not panning, and the picture shows a car driving across the screen. Looking at the motion vectors, you should have a (roughly) car-shaped bunch of macro-blocks that all have similar motion vectors (denoting the motion of the car). Then, rather than look at the entire picture for your object of interest, you can look at that block in isolation and try to identify it. Likewise, if the camera was panning with the car, you'd have a car-shaped block with small motion vectors, and most of the background would have similar motion vectors in the opposite direction of the car's movement.

Note, however, that this is likely to be imprecise at best. Just for example, let's assume our mythical car as driving in front of a brick building, with its headlights illuminating some of the bricks. In this case, a brick in one picture might (easily) not point back at the same brick in the previous picture, but instead point at the brick in the previous picture that happened to be illuminated about the same. The bricks are enough alike that the closest match will depend more on illumination than the brick itself.

You may be able, eventually, to parse and determine that h.264 has an object, but this will not be "object tracking" like your looking for. openCV is excellent software and what it does best. Have you considered scaling the video down to a smaller resolution for easier analysis by openCV?

I think you are highly over estimating the computing power of this $45 computer. Object recognition and tracking is VERY hard computationally speaking. I would start by seeing how many frames per second your board can track and optimize from there. Start looking at where your bottlenecks are, you may be better off processing raw video instead of having to decode h.264 video first. Again, RAW video takes a LOT of RAM, and processing through that takes a LOT of CPU.

Minimize overhead from decoding video, minimize RAM overhead by scaling down the video before analysis, but in the end, your asking a LOT from a 1ghz, 32bit ARM processor.

FFMPEG is a very old library that is not being supported now a days. It has very limited capabilities in terms of processing and object tracking in h.264 compressed video. Most of the commands usually are outdated. The best thing would be to study h.264 thoroughly and then try to implement your own API in some language like Java or c#.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM