
Hardware accelerated h.264 decoding to texture, overlay or similar in iOS

Is it possible, and supported, to use the iOS hardware accelerated h.264 decoding API to decode a local (not streamed) video file, and then compose other objects on top of it?

I would like to make an application that involves drawing graphical objects in front of a video, and use the playback timer to synchronize what I am drawing on top with what is being played in the video. Then, based on the user's actions, change what I am drawing on top (but not the video).

Coming from DirectX, OpenGL and OpenGL ES for Android, I am picturing something like rendering the video to a texture and using that texture to draw a full-screen quad, then using other sprites to draw the rest of the objects; or maybe writing an intermediate filter just before the renderer, so I can manipulate the individual output frames and draw my stuff; or maybe drawing to a 2D layer on top of the video.

It seems like AV Foundation or Core Media may help me do this, but before I dig into the details, I would like to know whether it is possible at all to do what I want to do, and what my main routes to approach the problem are.

Please refrain from "this is too advanced for you, try hello world first" answers. I know my stuff, and just want to know whether what I want to do is possible (and most importantly, supported, so the app won't eventually get rejected) before I study the details myself.

Edit:

I am not knowledgeable in iOS development, but I professionally do DirectX, OpenGL and OpenGL ES work for Android. I am considering making an iOS version of an Android application I currently have, and I just want to know if this is possible. If so, I have enough time to start iOS development from scratch and work up to doing what I want to do. If it is not possible, then I will just not invest time studying the entire platform at this time.

Therefore, this is a technical feasibility question. I am not requesting code. I am looking for answers of the type "Yes, you can do that. Just use A and B, use C to render into D and draw your stuff with E", or "No, you can't. The hardware accelerated decoding is not available for third-party applications" (which is what a friend told me). Just this, and I'll be on my way.

I have read the overview of the video technologies on page 32 of the iOS Technology Overview. It pretty much says that I can use Media Player for the most simple playback functionality (not what I'm looking for), UIKit for embedding videos with a little more control over the embedding, but not over the actual playback (not what I'm looking for), AVFoundation for more control over playback (maybe what I need, but most of the resources I find online talk about how to use the camera), or Core Media for full low-level control over video (probably what I need, but extremely poorly documented, and even more lacking in resources on playback than AVFoundation).

I am concerned that I may dedicate the next six months to learning iOS programming full time, only to find at the end that the relevant API is not available to third-party developers, and that what I want to do is unacceptable for iTunes Store deployment. This is what my friend told me, but I can't seem to find anything relevant in the app development guidelines. Therefore, I came here to ask people who have more experience in this area whether or not what I want to do is possible. No more.

I consider this a valid high-level question, which can be misunderstood as an I-didn't-do-my-homework-plz-give-me-teh-codez question. If my judgment here was mistaken, feel free to delete or downvote this question to your heart's content.

Yes, you can do this, and I think your question was specific enough to belong here. You're not the only one who has wanted to do this, and it does take a little digging to figure out what you can and can't do.

AV Foundation lets you do hardware-accelerated decoding of H.264 videos using an AVAssetReader, at which point you're handed the raw decoded frames of video in BGRA format. These can be uploaded to a texture using either glTexImage2D() or the more efficient texture caches in iOS 5.0. From there, you can process the frames for display, or retrieve them from OpenGL ES and use an AVAssetWriter to perform hardware-accelerated H.264 encoding of the result. All of this uses public APIs, so at no point do you get anywhere near something that would lead to a rejection from the App Store.
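The decode-to-texture side is shown in the GPUImageMovie code later in this answer. For the re-encode side, here is a rough sketch of the AVAssetWriter half, not taken from GPUImage; assetWriter, writerVideoInput, and writerPixelBufferAdaptor are assumed to be instance variables, and the pixel buffers and timestamps are assumed to come from your own processing code:

// Sketch only: hardware H.264 encoding of processed BGRA pixel buffers via AVAssetWriter.
// assetWriter, writerVideoInput and writerPixelBufferAdaptor are assumed ivars.
- (void)setUpMovieWriterForURL:(NSURL *)outputURL size:(CGSize)size
{
    NSError *error = nil;
    assetWriter = [[AVAssetWriter alloc] initWithURL:outputURL fileType:AVFileTypeQuickTimeMovie error:&error];

    NSDictionary *videoSettings = [NSDictionary dictionaryWithObjectsAndKeys:
                                   AVVideoCodecH264, AVVideoCodecKey,
                                   [NSNumber numberWithInt:(int)size.width], AVVideoWidthKey,
                                   [NSNumber numberWithInt:(int)size.height], AVVideoHeightKey,
                                   nil];
    writerVideoInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo outputSettings:videoSettings];
    writerVideoInput.expectsMediaDataInRealTime = NO;

    // The pixel buffer adaptor lets you hand CVPixelBufferRefs straight to the encoder.
    writerPixelBufferAdaptor = [AVAssetWriterInputPixelBufferAdaptor assetWriterInputPixelBufferAdaptorWithAssetWriterInput:writerVideoInput sourcePixelBufferAttributes:nil];
    [assetWriter addInput:writerVideoInput];

    [assetWriter startWriting];
    [assetWriter startSessionAtSourceTime:kCMTimeZero];
}

- (void)appendProcessedFrame:(CVPixelBufferRef)pixelBuffer atTime:(CMTime)presentationTime
{
    // Append each processed frame with the timestamp it was decoded with.
    if (writerVideoInput.readyForMoreMediaData)
    {
        [writerPixelBufferAdaptor appendPixelBuffer:pixelBuffer withPresentationTime:presentationTime];
    }
}

- (void)finishEncoding
{
    [writerVideoInput markAsFinished];
    [assetWriter finishWriting]; // newer iOS versions prefer -finishWritingWithCompletionHandler:
}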

However, you don't have to roll your own implementation of this. My BSD-licensed open source framework GPUImage encapsulates these operations and handles all of this for you. You create a GPUImageMovie instance for your input H.264 movie, attach filters onto it (such as overlay blends or chroma keying operations), and then attach those filters to a GPUImageView for display and/or a GPUImageMovieWriter to re-encode an H.264 movie from the processed video.
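As a rough illustration of that pipeline (treat this as a sketch rather than verified, copy-paste code; movieFile, overlayPicture, and movieWriter are assumed to be instance variables so they stay alive for the duration of processing, and movieURL, outputURL, and overlay.png are placeholders):

// Sketch: movie -> blend filter (with a still overlay) -> on-screen view and/or movie re-encoder.
movieFile = [[GPUImageMovie alloc] initWithURL:movieURL];

GPUImageChromaKeyBlendFilter *blendFilter = [[GPUImageChromaKeyBlendFilter alloc] init];
[movieFile addTarget:blendFilter];

// Second input to the blend: a static overlay image (this could also be another movie or filter chain).
overlayPicture = [[GPUImagePicture alloc] initWithImage:[UIImage imageNamed:@"overlay.png"]];
[overlayPicture addTarget:blendFilter];
[overlayPicture processImage];

// Display the composited result on screen...
GPUImageView *filterView = (GPUImageView *)self.view;
[blendFilter addTarget:filterView];

// ...and/or re-encode it to a new H.264 movie.
movieWriter = [[GPUImageMovieWriter alloc] initWithMovieURL:outputURL size:CGSizeMake(640.0, 480.0)];
[blendFilter addTarget:movieWriter];

[movieWriter startRecording];
[movieFile startProcessing];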

The one issue I currently have is that I don't obey the timestamps in the video for playback, so frames are processed as quickly as they are decoded from the movie. For filtering and re-encoding of a video, this isn't a problem, because the timestamps are passed through to the recorder, but for direct display to the screen this means that the video can be sped up by as much as 2-4X. I'd welcome any contributions that would let you synchronize the playback rate to the actual video timestamps.
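One possible approach, sketched here purely as an idea and not as existing GPUImage code (previousFrameTime and previousActualTime are assumed instance variables, reset when playback starts), would be to sleep off the difference between the movie's presentation timestamps and wall-clock time before handing each frame to the display targets:

// Idea sketch: pace frame display against the movie's own timestamps.
- (void)waitUntilTimeToDisplayFrameAtTime:(CMTime)currentSampleTime
{
    CGFloat frameTimeDifference = CMTimeGetSeconds(CMTimeSubtract(currentSampleTime, previousFrameTime));
    CGFloat actualTimeDifference = CFAbsoluteTimeGetCurrent() - previousActualTime;

    if (frameTimeDifference > actualTimeDifference)
    {
        // Sleep off the remainder so display roughly matches the encoded frame rate.
        usleep(1000000.0 * (frameTimeDifference - actualTimeDifference));
    }

    previousFrameTime = currentSampleTime;
    previousActualTime = CFAbsoluteTimeGetCurrent();
}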

I can currently play back, filter, and re-encode 640x480 video at well over 30 FPS on an iPhone 4, and 720p video at ~20-25 FPS, with the iPhone 4S being capable of 1080p filtering and encoding at significantly higher than 30 FPS. Some of the more expensive filters can tax the GPU and slow this down a bit, but most filters operate in these framerate ranges.

If you want, you can examine the GPUImageMovie class to see how it does this uploading to OpenGL ES, but the relevant code is as follows:

- (void)startProcessing;
{
    NSDictionary *inputOptions = [NSDictionary dictionaryWithObject:[NSNumber numberWithBool:YES] forKey:AVURLAssetPreferPreciseDurationAndTimingKey];
    AVURLAsset *inputAsset = [[AVURLAsset alloc] initWithURL:self.url options:inputOptions];

    [inputAsset loadValuesAsynchronouslyForKeys:[NSArray arrayWithObject:@"tracks"] completionHandler: ^{
        NSError *error = nil;
        AVKeyValueStatus tracksStatus = [inputAsset statusOfValueForKey:@"tracks" error:&error];
        if (tracksStatus != AVKeyValueStatusLoaded) 
        {
            return;
        }
        reader = [AVAssetReader assetReaderWithAsset:inputAsset error:&error];

        NSMutableDictionary *outputSettings = [NSMutableDictionary dictionary];
        [outputSettings setObject: [NSNumber numberWithInt:kCVPixelFormatType_32BGRA]  forKey: (NSString*)kCVPixelBufferPixelFormatTypeKey];
        // Maybe set alwaysCopiesSampleData to NO on iOS 5.0 for faster video decoding
        AVAssetReaderTrackOutput *readerVideoTrackOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:[[inputAsset tracksWithMediaType:AVMediaTypeVideo] objectAtIndex:0] outputSettings:outputSettings];
        [reader addOutput:readerVideoTrackOutput];

        NSArray *audioTracks = [inputAsset tracksWithMediaType:AVMediaTypeAudio];
        BOOL shouldRecordAudioTrack = (([audioTracks count] > 0) && (self.audioEncodingTarget != nil) );
        AVAssetReaderTrackOutput *readerAudioTrackOutput = nil;

        if (shouldRecordAudioTrack)
        {            
            audioEncodingIsFinished = NO;

            // This might need to be extended to handle movies with more than one audio track
            AVAssetTrack* audioTrack = [audioTracks objectAtIndex:0];
            readerAudioTrackOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:audioTrack outputSettings:nil];
            [reader addOutput:readerAudioTrackOutput];
        }

        if ([reader startReading] == NO) 
        {
            NSLog(@"Error reading from file at URL: %@", self.url);
            return;
        }

        if (synchronizedMovieWriter != nil)
        {
            __unsafe_unretained GPUImageMovie *weakSelf = self;

            [synchronizedMovieWriter setVideoInputReadyCallback:^{
                [weakSelf readNextVideoFrameFromOutput:readerVideoTrackOutput];
            }];

            [synchronizedMovieWriter setAudioInputReadyCallback:^{
                [weakSelf readNextAudioSampleFromOutput:readerAudioTrackOutput];
            }];

            [synchronizedMovieWriter enableSynchronizationCallbacks];
        }
        else
        {
            while (reader.status == AVAssetReaderStatusReading) 
            {
                [self readNextVideoFrameFromOutput:readerVideoTrackOutput];

                if ( (shouldRecordAudioTrack) && (!audioEncodingIsFinished) )
                {
                    [self readNextAudioSampleFromOutput:readerAudioTrackOutput];
                }

            }            

            if (reader.status == AVAssetReaderStatusCompleted) {
                [self endProcessing];
            }
        }
    }];
}

- (void)readNextVideoFrameFromOutput:(AVAssetReaderTrackOutput *)readerVideoTrackOutput;
{
    if (reader.status == AVAssetReaderStatusReading)
    {
        CMSampleBufferRef sampleBufferRef = [readerVideoTrackOutput copyNextSampleBuffer];
        if (sampleBufferRef) 
        {
            runOnMainQueueWithoutDeadlocking(^{
                [self processMovieFrame:sampleBufferRef]; 
            });

            CMSampleBufferInvalidate(sampleBufferRef);
            CFRelease(sampleBufferRef);
        }
        else
        {
            videoEncodingIsFinished = YES;
            [self endProcessing];
        }
    }
    else if (synchronizedMovieWriter != nil)
    {
        if (reader.status == AVAssetReaderStatusCompleted) 
        {
            [self endProcessing];
        }
    }
}

- (void)processMovieFrame:(CMSampleBufferRef)movieSampleBuffer; 
{
    CMTime currentSampleTime = CMSampleBufferGetOutputPresentationTimeStamp(movieSampleBuffer);
    CVImageBufferRef movieFrame = CMSampleBufferGetImageBuffer(movieSampleBuffer);

    int bufferHeight = CVPixelBufferGetHeight(movieFrame);
    int bufferWidth = CVPixelBufferGetWidth(movieFrame);

    CFAbsoluteTime startTime = CFAbsoluteTimeGetCurrent();

    if ([GPUImageOpenGLESContext supportsFastTextureUpload])
    {
        CVPixelBufferLockBaseAddress(movieFrame, 0);

        [GPUImageOpenGLESContext useImageProcessingContext];
        CVOpenGLESTextureRef texture = NULL;
        CVReturn err = CVOpenGLESTextureCacheCreateTextureFromImage(kCFAllocatorDefault, coreVideoTextureCache, movieFrame, NULL, GL_TEXTURE_2D, GL_RGBA, bufferWidth, bufferHeight, GL_BGRA, GL_UNSIGNED_BYTE, 0, &texture);

        if (!texture || err) {
            NSLog(@"Movie CVOpenGLESTextureCacheCreateTextureFromImage failed (error: %d)", err);  
            return;
        }

        outputTexture = CVOpenGLESTextureGetName(texture);
        //        glBindTexture(CVOpenGLESTextureGetTarget(texture), outputTexture);
        glBindTexture(GL_TEXTURE_2D, outputTexture);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);

        for (id<GPUImageInput> currentTarget in targets)
        {            
            NSInteger indexOfObject = [targets indexOfObject:currentTarget];
            NSInteger targetTextureIndex = [[targetTextureIndices objectAtIndex:indexOfObject] integerValue];

            [currentTarget setInputSize:CGSizeMake(bufferWidth, bufferHeight) atIndex:targetTextureIndex];
            [currentTarget setInputTexture:outputTexture atIndex:targetTextureIndex];

            [currentTarget newFrameReadyAtTime:currentSampleTime];
        }

        CVPixelBufferUnlockBaseAddress(movieFrame, 0);

        // Flush the CVOpenGLESTexture cache and release the texture
        CVOpenGLESTextureCacheFlush(coreVideoTextureCache, 0);
        CFRelease(texture);
        outputTexture = 0;        
    }
    else
    {
        // Upload to texture
        CVPixelBufferLockBaseAddress(movieFrame, 0);

        glBindTexture(GL_TEXTURE_2D, outputTexture);
        // Using BGRA extension to pull in video frame data directly
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, bufferWidth, bufferHeight, 0, GL_BGRA, GL_UNSIGNED_BYTE, CVPixelBufferGetBaseAddress(movieFrame));

        CGSize currentSize = CGSizeMake(bufferWidth, bufferHeight);
        for (id<GPUImageInput> currentTarget in targets)
        {
            NSInteger indexOfObject = [targets indexOfObject:currentTarget];
            NSInteger targetTextureIndex = [[targetTextureIndices objectAtIndex:indexOfObject] integerValue];

            [currentTarget setInputSize:currentSize atIndex:targetTextureIndex];
            [currentTarget newFrameReadyAtTime:currentSampleTime];
        }
        CVPixelBufferUnlockBaseAddress(movieFrame, 0);
    }

    if (_runBenchmark)
    {
        CFAbsoluteTime currentFrameTime = (CFAbsoluteTimeGetCurrent() - startTime);
        NSLog(@"Current frame time : %f ms", 1000.0 * currentFrameTime);
    }
}
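If you would rather drive the compositing yourself in plain OpenGL ES 2.0, as described in the question, the decoded texture obtained above can be drawn as a full-screen quad with your own sprites blended on top. The following is only a sketch under the assumption that quadProgram, positionAttribute, textureCoordinateAttribute, and videoTextureUniform come from your own shader setup, and that videoTexture is the texture produced from the movie frame:

// Sketch only: draw the decoded video texture as a full-screen quad, then blend overlays on top.
- (void)drawVideoFrameWithOverlaysUsingTexture:(GLuint)videoTexture
{
    static const GLfloat quadVertices[] = {
        -1.0f, -1.0f,
         1.0f, -1.0f,
        -1.0f,  1.0f,
         1.0f,  1.0f,
    };
    static const GLfloat quadTextureCoordinates[] = {
        0.0f, 1.0f,
        1.0f, 1.0f,
        0.0f, 0.0f,
        1.0f, 0.0f,
    };

    glUseProgram(quadProgram);

    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_2D, videoTexture);
    glUniform1i(videoTextureUniform, 0);

    glVertexAttribPointer(positionAttribute, 2, GL_FLOAT, GL_FALSE, 0, quadVertices);
    glEnableVertexAttribArray(positionAttribute);
    glVertexAttribPointer(textureCoordinateAttribute, 2, GL_FLOAT, GL_FALSE, 0, quadTextureCoordinates);
    glEnableVertexAttribArray(textureCoordinateAttribute);

    // Full-screen quad carrying the current video frame.
    glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);

    // Then draw your own sprites / UI geometry over the video with ordinary alpha blending.
    glEnable(GL_BLEND);
    glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
    // ... issue your overlay draw calls here ...
}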
