
Decode an H264 stream using the VideoToolbox API (kVTVideoDecoderBadDataErr)

My goal is to encode the main framebuffer of my Windows machine using NVENC and stream its contents to my iPad, where it is decoded with the VideoToolbox API.

The code I use to encode the H.264 stream is basically a copy/paste of https://github.com/NVIDIA/video-sdk-samples/tree/master/nvEncDXGIOutputDuplicationSample. The only change is that instead of writing to a file, I send the data to the iPad.

For the decoding I use https://github.com/zerdzhong/SwfitH264Demo/blob/master/SwiftH264/ViewController.swift#L71

The encoding works perfectly when I write all the contents to a file: I am able to use an online H.264-to-MP4 converter without issue. The problem is that the decoder gives me the error kVTVideoDecoderBadDataErr in the function decompressionSessionDecodeFrameCallback.

So far, this is what I have tried:

  • First, using an H.264 analyzer, I found that the frames (NAL unit types) come in the order 7/8/5/5/5/5/1...
  • I found that NVENC emits the frames 7/8/5/5/5/5 in a single packet
  • I tried to separate this packet into multiple ones by splitting on the sequence (0x00 0x00 0x00 0x01), which gave me the frames 7/8/5 separately
  • As you can see, I only got one type 5 frame of around 100KB, while the H.264 analyzer said that there are four type 5 frames (of something like 40KB, 20KB, 30KB and 10KB)
  • Using a hex viewer, I saw that the sequence separating those type 5 frames is (0x00 0x00 0x01) instead. I tried to split on that as well, but I got the exact same VideoToolbox error while decompressing
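To illustrate the splitting described in the list above, here is a minimal sketch of a splitter that accepts both start-code forms in a single pass. The function names `split_annex_b` and `nal_unit_type` are hypothetical (they are not part of NVENC or of the original code), and the sketch assumes the packet is a plain Annex B byte stream.

```rust
/// Split an Annex B byte stream into NAL units with their start codes removed.
/// Both the 3-byte (00 00 01) and 4-byte (00 00 00 01) start codes are accepted.
/// (Hypothetical helper, for illustration only.)
fn split_annex_b(data: &[u8]) -> Vec<&[u8]> {
    let mut nalus = Vec::new();
    // Index of the first payload byte of the NAL unit currently being collected.
    let mut start: Option<usize> = None;
    let mut i = 0;
    while i + 3 <= data.len() {
        if data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1 {
            if let Some(s) = start {
                // Zero bytes right before a start code belong to the separator
                // (a 4-byte start code is a 3-byte one preceded by a zero).
                let mut end = i;
                while end > s && data[end - 1] == 0 {
                    end -= 1;
                }
                if end > s {
                    nalus.push(&data[s..end]);
                }
            }
            start = Some(i + 3);
            i += 3;
        } else {
            i += 1;
        }
    }
    // Whatever follows the last start code is the final NAL unit.
    if let Some(s) = start {
        if s < data.len() {
            nalus.push(&data[s..]);
        }
    }
    nalus
}

/// The NAL unit type is the low 5 bits of the first payload byte.
/// `split_annex_b` never returns an empty slice, so indexing [0] is safe there.
fn nal_unit_type(nalu: &[u8]) -> u8 {
    nalu[0] & 0x1F
}
```

Calling `split_annex_b` on a packet and mapping `nal_unit_type` over the result should list the same types the analyzer reports, including each of the type 5 units that a 4-byte-only search merges into one.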

Here is the code I use to separate and send the frames. The protocol is simply PACKET_SIZE->PACKET_DATA. The Swift code is able to read the NALU types, so I am confident that this is not the issue.

    // Needs `use std::io::Write;` in scope for `write_all`.
    unsafe {
        Setup();
        loop {
            CaptureFrame();

            let frame_count = GetDataCount();
            if frame_count == 0 {
                continue;
            }

            for i in 0..frame_count {
                let size = RetrieveDataSize(i as i32);
                let data = RetrieveData(i as i32);
                let data_slice = std::slice::from_raw_parts(data, size);

                // Offset of the start code that begins the NAL unit being collected.
                let mut last_frame = 0;

                // Stop 4 bytes before the end so that reading the start code and
                // the NAL header byte at x + 4 never goes out of bounds.
                for x in 0..size.saturating_sub(4) {
                    if data_slice[x] == 0 &&
                        data_slice[x + 1] == 0 &&
                        data_slice[x + 2] == 0 &&
                        data_slice[x + 3] == 1 {
                        let frame_size = x - last_frame;
                        if frame_size > 0 {
                            // Flush the previous NAL unit: little-endian size, then data.
                            // `write_all` is used because `write` may send only part of the buffer.
                            let frame_data = &data_slice[last_frame..x];
                            stream.write_all(&(u32::to_le_bytes(frame_size as u32))).unwrap();
                            stream.write_all(frame_data).unwrap();
                            println!("SEND MULTIPLE {}", frame_size);
                        }

                        last_frame = x;
                        println!("NALU {}", data_slice[x + 4] & 0x1F);
                    }
                }
                // Send the last (or only) NAL unit of the packet.
                let frame_size = size - last_frame;
                let frame_data = &data_slice[last_frame..size];
                stream.write_all(&(u32::to_le_bytes(frame_size as u32))).unwrap();
                stream.write_all(frame_data).unwrap();
                println!("SEND SINGLE {} {}", last_frame, size);
            }
        }
    }
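One thing worth checking, offered as a hedged sketch rather than a confirmed fix: with an 'avc1' format description, VideoToolbox expects each NAL unit in the sample data to be prefixed by its length in big-endian instead of a start code (the AVCC layout). The linked Swift demo overwrites the 4-byte start code with that length, so any 3-byte start codes left inside the buffer would be treated as payload. The helper names below are hypothetical, and the 4-byte length size is an assumption that should match the format description.

```rust
/// Strip a leading Annex B start code (3 or 4 bytes), if present.
/// (Hypothetical helper, for illustration only.)
fn strip_start_code(nalu: &[u8]) -> &[u8] {
    if nalu.starts_with(&[0, 0, 0, 1]) {
        &nalu[4..]
    } else if nalu.starts_with(&[0, 0, 1]) {
        &nalu[3..]
    } else {
        nalu
    }
}

/// Build an AVCC buffer: every NAL unit is prefixed by its length as a
/// 4-byte big-endian integer (assumed length size of 4).
fn to_avcc(nalus: &[&[u8]]) -> Vec<u8> {
    let mut out = Vec::new();
    for nalu in nalus {
        let payload = strip_start_code(nalu);
        out.extend_from_slice(&(payload.len() as u32).to_be_bytes());
        out.extend_from_slice(payload);
    }
    out
}
```

If the four type 5 units are slices of the same picture, they would presumably go into one buffer (one call to `to_avcc` with all four) so that the decoder receives the whole frame at once.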

It could be related to the texture format: VideoToolbox mentions kCVPixelFormatType_420YpCbCr8BiPlanarFullRange, and the NVENC code mentions YUV420 and NV12. I am unsure whether they are the same.
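As a rough sketch of how these layouts relate: NV12 is a bi-planar 4:2:0 layout (a full-resolution Y plane followed by one plane of interleaved Cb/Cr samples), which is the same plane arrangement the "BiPlanar" 4:2:0 Core Video formats describe, while planar YUV420 (I420) keeps Cb and Cr in two separate planes. The function names are hypothetical, and the sizes assume even dimensions and no row padding.

```rust
/// Plane sizes in bytes for NV12: Y plane, then one interleaved CbCr plane.
/// Assumes even width/height and no row padding.
fn nv12_planes(width: usize, height: usize) -> Vec<usize> {
    vec![width * height, width * height / 2]
}

/// Plane sizes in bytes for planar YUV420 (I420): Y, Cb and Cr planes.
/// Assumes even width/height and no row padding.
fn i420_planes(width: usize, height: usize) -> Vec<usize> {
    vec![width * height, width * height / 4, width * height / 4]
}
```

Both layouts carry the same number of samples per frame; they differ only in how the chroma is arranged. This describes uncompressed frames on either side of the codec, not the H.264 bitstream itself.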

Here is my format description:

Optional(<CMVideoFormatDescription 0x2823dd410 [0x1e0921e20]> {
    mediaType:'vide' 
    mediaSubType:'avc1' 
    mediaSpecific: {
        codecType: 'avc1'       dimensions: 3840 x 2160 
    } 
    extensions: {{
    CVFieldCount = 1;
    CVImageBufferChromaLocationBottomField = Left;
    CVImageBufferChromaLocationTopField = Left;
    CVPixelAspectRatio =     {
        HorizontalSpacing = 1;
        VerticalSpacing = 1;
    };
    FullRangeVideo = 0;
    SampleDescriptionExtensionAtoms =     {
        avcC = {length = 41, bytes = 0x01640033 ffe10016 67640033 ac2b401e ... 68ee3cb0 fdf8f800 };
    };
}}
})
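The first bytes of the avcC atom in the dump above can be decoded by hand. The following is a minimal sketch of reading the fixed header of an AVCDecoderConfigurationRecord; the struct and function names are hypothetical, and only the first eight bytes are interpreted.

```rust
/// Fields from the start of an avcC (AVCDecoderConfigurationRecord) atom.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
struct AvcCHeader {
    profile_idc: u8,
    level_idc: u8,
    /// Number of bytes used for each NAL unit length prefix.
    nal_length_size: u8,
    sps_count: u8,
    /// Length of the first SPS; only meaningful when sps_count >= 1.
    first_sps_len: u16,
}

/// Parse the first 8 bytes of an avcC atom; returns None if the input is
/// too short or the configuration version is not 1.
fn parse_avcc_header(b: &[u8]) -> Option<AvcCHeader> {
    if b.len() < 8 || b[0] != 1 {
        return None;
    }
    Some(AvcCHeader {
        profile_idc: b[1],
        level_idc: b[3],
        // Low 2 bits hold lengthSizeMinusOne.
        nal_length_size: (b[4] & 0x03) + 1,
        // Low 5 bits hold the number of SPS entries.
        sps_count: b[5] & 0x1F,
        first_sps_len: u16::from_be_bytes([b[6], b[7]]),
    })
}
```

Applied to the bytes 01 64 00 33 ff e1 00 16 from the dump, this gives profile 100 (High), level 51, one SPS of 22 bytes, and a NAL length size of 4, so the decoder will expect a 4-byte length in front of every NAL unit.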

Alright, as weird as it sounds, my code does work on the simulator but not on my iPad Pro. In the end it does work, so I'll still mark it as the correct answer.
