
h.264 bytestream parsing

Original post: 2011-04-03 09:27:07. Tags: c++, c, video, video-streaming, h.264

The input data is a byte array that represents an H.264 frame. The frame consists of a single slice (it is not a multislice frame).

So, as I understand it, I can treat this frame as a slice. The slice has a header and slice data: macroblocks, each with its own header.

So I have to parse that byte array to extract the frame number, the frame type, and the quantisation coefficient (as I understand it, each macroblock has its own coefficient, or am I wrong?).

Could you advise me where I can get more detailed information about parsing H.264 frame bytes?

(In fact I've read the standard, but it wasn't very specific, and I'm lost.)

Thanks

3 answers

The H.264 standard is a bit hard to read, so here are some tips.

Read Annex B; make sure your input starts with a start code.
Read section 9.1: you will need it for all of the following.
The slice header is described in section 7.3.3.
"Frame number" is not encoded explicitly in the slice header; frame_num is close to what you probably want.
"Frame type" probably corresponds to slice_type (the second value in the slice header, so the easiest to parse; you should definitely start with this one).
"Quantization coefficient": do you mean "quantization parameter"? If yes, be prepared to write a full H.264 parser (or reuse an existing one). Look at section 9.3 to get an idea of the complexity of an H.264 parser.

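To make the first few tips concrete, here is a minimal C++ sketch of reading slice_type from a single slice NAL unit. It assumes the start code has already been removed and the buffer holds exactly one NAL unit; the names BitReader, SliceInfo, parseSliceStart, stripEmulationPrevention and sliceTypeName are made up for illustration and do not come from any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Inside a NAL unit the byte sequence 00 00 03 encodes 00 00;
// the 0x03 (emulation prevention byte) must be dropped before bit parsing.
std::vector<uint8_t> stripEmulationPrevention(const std::vector<uint8_t>& nal) {
    std::vector<uint8_t> out;
    int zeros = 0;
    for (uint8_t b : nal) {
        if (zeros >= 2 && b == 0x03) { zeros = 0; continue; }
        zeros = (b == 0x00) ? zeros + 1 : 0;
        out.push_back(b);
    }
    return out;
}

// Hypothetical helper: reads bits MSB-first; bits past the end read as 0.
class BitReader {
public:
    explicit BitReader(std::vector<uint8_t> data) : data_(std::move(data)) {}

    // u(n): fixed-length unsigned value of n bits.
    uint32_t readBits(int n) {
        assert(n >= 0 && n <= 32);
        uint32_t v = 0;
        for (int i = 0; i < n; ++i) {
            size_t byte = pos_ / 8;
            uint32_t bit = byte < data_.size()
                ? (data_[byte] >> (7 - pos_ % 8)) & 1u : 0u;
            v = (v << 1) | bit;
            ++pos_;
        }
        return v;
    }

    // ue(v): unsigned Exp-Golomb code (section 9.1).
    // Count leading zero bits, then read that many bits as a suffix.
    uint32_t readUE() {
        int leadingZeros = 0;
        while (leadingZeros < 31 && readBits(1) == 0) ++leadingZeros;
        return (1u << leadingZeros) - 1 + readBits(leadingZeros);
    }

private:
    std::vector<uint8_t> data_;
    size_t pos_ = 0;
};

struct SliceInfo {
    uint32_t nalUnitType;  // 1 = non-IDR slice, 5 = IDR slice
    uint32_t firstMb;      // first_mb_in_slice
    uint32_t sliceType;    // slice_type, 0..9
};

// Parses the one-byte NAL header and the first two slice header fields.
SliceInfo parseSliceStart(const std::vector<uint8_t>& nal) {
    BitReader br(stripEmulationPrevention(nal));
    br.readBits(1);  // forbidden_zero_bit
    br.readBits(2);  // nal_ref_idc
    SliceInfo s;
    s.nalUnitType = br.readBits(5);
    s.firstMb = br.readUE();
    s.sliceType = br.readUE();
    return s;
}

// slice_type values 5..9 mean the same as 0..4.
std::string sliceTypeName(uint32_t sliceType) {
    static const char* names[] = {"P", "B", "I", "SP", "SI"};
    return names[sliceType % 5];
}
```

For example, parseSliceStart({0x65, 0x88}) gives nal_unit_type 5 (IDR slice), first_mb_in_slice 0 and slice_type 7, which sliceTypeName maps to "I". Going further to frame_num is harder than it looks: it follows pic_parameter_set_id in the slice header and has a bit length of log2_max_frame_num_minus4 + 4, so the SPS has to be parsed first.
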
The standard is very hard to read. You can try to analyze the source code of existing H.264 video stream decoding software such as ffmpeg, with its C (C99) libraries. For example, there is the avcodec_decode_video2 function, documented here. You can get a full working C example (open a file, get the H.264 stream, iterate through frames, dump information, get the colorspace, save frames as raw PPM images, etc.) here. Alternatively, there is the great book "The H.264 Advanced Video Compression Standard", which explains the standard in "human language". Another option is to try the Elecard StreamEye Pro software (there is a trial version), which could give you some additional (visual) perspective.

In my opinion, it is actually much better and easier to read the H.264 video coding documentation. ffmpeg is a very good library, but it contains a lot of optimized code. It is better to look at the reference implementation of the H.264 codec and the official documentation. http://iphome.hhi.de/suehring/tml/download/ is the link to the JM codec implementation. Try to separate the levels of the decoding process, such as the transport layer that contains the NAL units (SPS, PPS, SEI, IDR, SLICE, etc.). Then you need to implement the VLC engine (mostly order-0 Exp-Golomb codes). Then comes the very difficult and powerful entropy coder called CABAC (Context-Adaptive Binary Arithmetic Coding). It is quite a tricky task. The demuxing process (which comes after unpacking the video data) is also complicated. You need to understand each of these modules completely. Good luck.
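
As a rough illustration of the first layer mentioned above, here is a short C++ sketch that splits an Annex B byte stream into NAL units and names the common types. It assumes the whole stream is already in memory; NalUnit, splitAnnexB and nalTypeName are hypothetical names, not part of JM or ffmpeg.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct NalUnit {
    size_t offset;  // index of the NAL header byte in the input buffer
    size_t size;    // NAL unit length in bytes, start code excluded
    int type;       // nal_unit_type (low 5 bits of the header byte)
};

// Finds every 00 00 01 start code and returns the NAL units between them.
std::vector<NalUnit> splitAnnexB(const std::vector<uint8_t>& buf) {
    std::vector<size_t> starts;  // first byte after each start code
    for (size_t i = 0; i + 2 < buf.size(); ++i) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1) {
            starts.push_back(i + 3);
            i += 2;  // skip the rest of this start code
        }
    }
    std::vector<NalUnit> units;
    for (size_t k = 0; k < starts.size(); ++k) {
        size_t begin = starts[k];
        size_t end = (k + 1 < starts.size()) ? starts[k + 1] - 3 : buf.size();
        // A NAL unit never ends in 0x00, so trailing zeros are either the
        // extra zero of a 4-byte start code (00 00 00 01) or padding.
        while (end > begin && buf[end - 1] == 0) --end;
        if (end > begin) {
            units.push_back({begin, end - begin, buf[begin] & 0x1F});
        }
    }
    return units;
}

// Names for the NAL unit types listed in the answer above.
std::string nalTypeName(int type) {
    switch (type) {
        case 1: return "non-IDR slice";
        case 5: return "IDR slice";
        case 6: return "SEI";
        case 7: return "SPS";
        case 8: return "PPS";
        case 9: return "access unit delimiter";
        default: return "other";
    }
}
```

Each returned unit still contains emulation prevention bytes (00 00 03), which have to be removed before the Exp-Golomb or CABAC layers read the payload.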