A way to extract hands from a video

I wonder whether it is possible to extract only the hands from a video with MATLAB. In the video, the hands perform some gestures. Because the first frames contain only the background, I tried it this way:

readerObj = VideoReader('VideoWithHands.mp4');
nFrames = readerObj.NumberOfFrames;
fr = get(readerObj, 'FrameRate');
writerObj = VideoWriter('Hands.mp4', 'MPEG-4');
set(writerObj, 'FrameRate', fr);
open(writerObj);
bg = read(readerObj, 1);   %background
for k = 1 : nFrames
      frame = read(readerObj, k);
      hands = imabsdiff(frame,bg);
      writeVideo(writerObj,hands);
end
close(writerObj);

But I realized that the colors of the hands are not "real" and that the hands look transparent. Is there a better way to extract them from the video, keeping their colors and opacity, by exploiting the first frames (background)?

EDIT: Well, I have found a good setting for the vision.ForegroundDetector object. Now the hands are white logical regions, but when I try to visualize them with:

videoSource = vision.VideoFileReader('VideoWithHands.mp4', 'VideoOutputDataType', 'uint8');
detector = vision.ForegroundDetector('NumTrainingFrames', 46, 'InitialVariance', 4000, 'MinimumBackgroundRatio', 0.2);
videoplayer = vision.VideoPlayer();
hands = uint8(zeros(720,1280,3));
while ~isDone(videoSource)
    frame = step(videoSource);
    fgMask = step(detector, frame);
    [m,n] = find(fgMask);
    a = [m n];
    if isempty(a)==true
        hands(:,:,:) = uint8(zeros(720,1280,3));
    else
        hands(m,n,1) = frame(m,n,1);
        hands(m,n,2) = frame(m,n,2);
        hands(m,n,3) = frame(m,n,3);
    end
    step(videoplayer, hands)
end

release(videoplayer)
release(videoSource)

or put them into a video file with:

readerObj = VideoReader('Video 9.mp4');
nFrames = readerObj.NumberOfFrames;
fr = get(readerObj, 'FrameRate');

writerObj = VideoWriter('hands.mp4', 'MPEG-4');
set(writerObj, 'FrameRate', fr);

detector = vision.ForegroundDetector('NumTrainingFrames', 46, 'InitialVariance', 4000, 'MinimumBackgroundRatio', 0.2);
open(writerObj);

bg = read(readerObj, 1);
frame = uint8(zeros(size(bg)));

for k = 1 : nFrames
    frame = read(readerObj, k);
    fgMask = step(detector, frame);
    [m,n] = find(fgMask);
    hands = uint8(zeros(720,1280));
    if isempty([m n]) == true
        hands(:,:) = uint8(zeros(720,1280));
    else
        hands(m,n) = frame(m,n);
    end
    writeVideo(writerObj,hands);
end

close(writerObj);

...my PC crashes. Any suggestions?

So you're trying to cancel out the background, making it black, right? The easiest way to do this is to filter it: compare your difference data to a threshold value, then use the result as indices to set a custom background.

filtered = imabsdiff(frame,bg);
bgindex = find( filtered < 10 );
frame(bgindex) = custombackground(bgindex);

where custombackground is whatever image you want to put into the background. If you want it to be just black or white, use 0 or 255 instead of custombackground(bgindex). Note that these numbers depend on your video data's format and could be inaccurate (except 0, which should always be right). If too much gets filtered out, lower the 10 above; if too much remains unfiltered, increase the 10.

At the end, you write the altered frame to the output video, so it simply replaces the hands variable in your code.
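The thresholding step above can be tried on synthetic data before running it on a whole video. The following is only a minimal sketch: the 4x6 grayscale frame, the flat background, and the pixel values are made up for illustration, and abs(double(...) - double(...)) is used as a base-MATLAB stand-in for imabsdiff so that the Image Processing Toolbox is not needed.

```matlab
% Synthetic stand-ins for one grayscale frame and the background frame
% (hypothetical data, not taken from the question's video).
bg    = uint8(50 * ones(4, 6));      % flat background
frame = bg;
frame(2:3, 2:4) = 200;               % bright "hand" region
frame(1, 1)     = 53;                % slight noise on a background pixel

threshold = 10;                      % same value as in the snippet above

% Absolute difference, computed in double to avoid uint8 saturation.
filtered = abs(double(frame) - double(bg));
bgindex  = find(filtered < threshold);   % linear indices of background pixels

hands = frame;
hands(bgindex) = 0;                  % black background; 255 would be white for uint8
```

Only the six pixels of the bright region keep their original value; the noisy pixel at (1,1) differs from the background by 3, which is below the threshold, so it is blacked out too.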

Also, depending on your format, you might have to do the comparison across RGB values. This is slightly more complicated, as it involves checking 3 values at the same time and doing some magic with the indices. This is the RGB version (it works with anything containing 3 color bands):

filtered = imabsdiff(frame,bg); % differences at each pixel in each color band
totalfiltered = sum(filtered,3); % sums up the differences
                                 % in each color band (RGB)
bgindex = find( totalfiltered < 10 ); % extracts indices of pixels
                                      % with color close to bg
allind = sub2ind( [numel(totalfiltered),3] , repmat(bgindex,1,3) , ...
                  repmat(1:3,numel(bgindex),1) ); % index magic

frame(allind) = custombackground(allind); % copy custom background into frame
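As a cross-check of the RGB version, the same pixels can also be selected with a logical mask replicated across the three bands. This is an alternative technique, not the sub2ind approach itself; the sketch below runs both on a small synthetic 4x5x3 frame (all values are invented for illustration) and compares the results. As before, abs on doubles stands in for imabsdiff.

```matlab
% Synthetic RGB background, frame, and replacement background (hypothetical data).
bg = zeros(4, 5, 3, 'uint8');
bg(:,:,1) = 40;  bg(:,:,2) = 80;  bg(:,:,3) = 120;
frame = bg;
frame(2,3,:) = reshape([220 180 150], 1, 1, 3);   % one "hand" pixel
frame(4,1,:) = reshape([43 82 121], 1, 1, 3);     % noisy background pixel
custombackground = uint8(255 * ones(4, 5, 3));    % plain white

filtered      = abs(double(frame) - double(bg));  % per-band differences
totalfiltered = sum(filtered, 3);                 % summed over the 3 bands
bgmask        = totalfiltered < 10;               % 4x5 logical background mask

% Route 1: replicate the mask over the bands and index logically.
bgmask3 = repmat(bgmask, [1 1 3]);
result  = frame;
result(bgmask3) = custombackground(bgmask3);

% Route 2: the sub2ind construction from the answer.
bgindex = find(bgmask);
allind  = sub2ind([numel(totalfiltered), 3], repmat(bgindex, 1, 3), ...
                  repmat(1:3, numel(bgindex), 1));
result2 = frame;
result2(allind) = custombackground(allind);
```

Both routes touch exactly the same elements, so result and result2 are identical; the logical mask is arguably easier to read, while the sub2ind version makes the underlying linear indices explicit.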

EDIT:

Here's a detailed explanation of the index magic.

Let's assume a 50x50 image. Say the pixel at row 2, column 5 is found to be background; then bgindex will contain the number 202 (the linear index corresponding to [2,5], i.e. (5-1)*50+2). What we need is a set of 3 indices corresponding to the matrix coordinates [2,5,1], [2,5,2] and [2,5,3]. That way, we can change all 3 color bands corresponding to that pixel. To make the calculation easier, this approach treats the image through linear indexing, i.e. as a 2500x1 image. Then it expands the 3 color bands, giving a 2500x3 matrix. We now construct the indices [202,1], [202,2] and [202,3] instead.

To do that, we first construct a matrix of indices by repeating our values. repmat does this for us: it creates the matrices [202 202 202] and [1 2 3]. If there were more pixels in bgindex, the first matrix would contain more rows, each repeating the linear pixel coordinate 3 times, and the second matrix would contain additional [1 2 3] rows. The first argument to sub2ind is the size of the matrix, in this case 2500x3, so we get the number of pixels by applying numel to the summed matrix (which collapses the image's 3 bands into 1 value and thus has 1 value per pixel) and add a static 3 as the second dimension.

sub2ind now takes each element of the first matrix as a row index and the corresponding element of the second matrix as a column index, and converts them to linear indices into a matrix of the size we determined earlier. In our example, this results in the indices [202 2702 5202]. sub2ind preserves the shape of its inputs, so if we had 10 background pixels, the result would have the size 10x3. But since linear indexing doesn't care about the shape of the index matrix, it simply uses all of those values.

To confirm this is correct, let's map the values in the example back. The original image data would have the size 50x50x3. For an NxMxP matrix, the linear index for the subscript [n m p] can be calculated as ind = (p-1)*M*N + (m-1)*N + n. Using our values, we get the following:

[2 5 1] => 202
[2 5 2] => 2702
[2 5 3] => 5202

ind2sub confirms this.
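The check can be reproduced in a few lines. This sketch only uses the 50x50x3 size and the pixel [2,5] from the explanation above; the variable names are chosen here for illustration.

```matlab
sz = [50 50 3];                                   % size of the example image

% Linear indices of pixel (2,5) in each of the 3 color bands.
ind = sub2ind(sz, [2 2 2], [5 5 5], [1 2 3]);     % [202 2702 5202]

% The same indices via the 2500x3 view used by the "index magic".
indFlat = sub2ind([2500 3], [202 202 202], [1 2 3]);

% The same indices via the closed-form expression.
N = 50; M = 50; n = 2; m = 5; p = 1:3;
indFormula = (p-1)*M*N + (m-1)*N + n;

% Going back from linear indices to subscripts.
[r, c, b] = ind2sub(sz, ind);                     % r = [2 2 2], c = [5 5 5], b = [1 2 3]
```

All three computations agree, which shows that indexing the 50x50x3 array linearly and indexing its 2500x3 reshaped view address the same elements.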

Yes, there is a better way. The Computer Vision System Toolbox includes a vision.ForegroundDetector object that does what you need. It implements the Gaussian Mixture Model algorithm for background subtraction.
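Once the detector returns a logical mask, one way to keep the original colors is to apply that mask to the frame directly. The sketch below is an illustration under assumptions: the 4x4x3 frame and the mask are synthetic, and in the real pipeline fgMask would come from step(detector, frame) as in the question. It also contrasts the mask with subscript-vector indexing such as hands(m,n,:), which selects every combination of the listed rows and columns rather than only the mask pixels.

```matlab
% Synthetic frame with distinct non-zero values and a 2-pixel foreground mask
% (hypothetical data; a real mask would come from vision.ForegroundDetector).
frame  = uint8(reshape(1:48, 4, 4, 3));
fgMask = false(4, 4);
fgMask(1,1) = true;
fgMask(3,4) = true;

% Copy only the foreground pixels, in all 3 color bands, onto a black canvas.
mask3 = repmat(fgMask, [1 1 3]);
hands = zeros(size(frame), 'uint8');
hands(mask3) = frame(mask3);

% For comparison: subscript vectors address rows {1,3} x columns {1,4},
% i.e. 4 pixels per band instead of the 2 mask pixels.
[m, n] = find(fgMask);
blocky = zeros(size(frame), 'uint8');
blocky(m, n, :) = frame(m, n, :);
```

With K foreground pixels, frame(m,n,1) builds a K-by-K temporary array, so on a 720x1280 frame with many foreground pixels this could plausibly account for the memory exhaustion described in the question. The logical mask copies only K values per band.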

