
Use MediaCodec for H264 streaming

I'm currently trying to use Android as a Skype endpoint. At this stage, I need to encode video into H.264 (since it's the only format supported by Skype) and encapsulate it with RTP in order to make the streaming work.

Apparently the MediaRecorder is not very well suited for this, for various reasons. One is that it adds the MP4 or 3GP headers only after it's finished. Another is that, in order to reduce latency to a minimum, hardware acceleration may come in handy. That's why I would like to make use of the recent low-level additions to the framework: MediaCodec, MediaExtractor, etc.

At the moment, I plan on working as follows. The camera writes its video into a buffer. MediaCodec encodes the video with H.264 and writes the result to another buffer. This buffer is read by an RTP encapsulator, which sends the stream data to the server. Here's my first question: does this plan sound feasible to you?

Now I'm already stuck on step one. Since all the documentation on the internet about using the camera makes use of MediaRecorder, I cannot find a way to store the raw camera data into a buffer before encoding. Is addCallbackBuffer suited for this? Does anyone have a link to an example?

Next, I cannot find much documentation about MediaCodec (since it's fairly new). Does anyone have a solid tutorial?

Lastly: any recommendations on RTP libraries?

Thanks a lot in advance!

UPDATE
I was finally able to create proper RTP packets from the H.264 frames. Here's what you have to keep in mind (it's actually quite simple):

The encoder does create NAL headers for each frame, but it returns each frame as an H.264 bytestream. This means that each frame starts with the Annex-B start code: three 0x00 bytes followed by a 0x01 byte. All you have to do is remove those start prefixes and put the frame into an RTP packet (or split it up using FU-As).
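As a sketch of that stripping step (plain Java; the class name is just illustrative), handling both the 4-byte and the short 3-byte Annex-B start code:

```java
import java.util.Arrays;

public class AnnexB {
    // Strip the leading Annex-B start code (00 00 00 01, or the short
    // form 00 00 01) so the bare NAL unit can be used as an RTP payload.
    static byte[] stripStartCode(byte[] frame) {
        if (frame.length >= 4 && frame[0] == 0 && frame[1] == 0
                && frame[2] == 0 && frame[3] == 1) {
            return Arrays.copyOfRange(frame, 4, frame.length);
        }
        if (frame.length >= 3 && frame[0] == 0 && frame[1] == 0 && frame[2] == 1) {
            return Arrays.copyOfRange(frame, 3, frame.length);
        }
        return frame; // no start code present, pass through unchanged
    }

    public static void main(String[] args) {
        byte[] frame = {0, 0, 0, 1, 0x65, 0x42};
        byte[] nal = stripStartCode(frame);
        System.out.println(nal.length); // 2
    }
}
```

The resulting bare NAL unit either fits into a single RTP packet or, if it exceeds the MTU, gets fragmented into FU-A packets as noted above.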

Now to your questions:

I cannot find a way to store its raw data into a buffer before encoding. Is addCallbackBuffer suited for this?

You should use camera.setPreviewCallback(...) and add each frame to the encoder.
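A minimal sketch of how that might be wired up, assuming the old android.hardware.Camera API and the common NV21 preview format (the Android calls are shown in comments since they need a device; only the buffer-size math is real code here):

```java
public class PreviewBuffers {
    // NV21, the default preview format, uses 12 bits per pixel: a full
    // resolution Y plane plus a quarter-resolution interleaved VU plane.
    static int nv21BufferSize(int width, int height) {
        return width * height * 3 / 2;
    }

    public static void main(String[] args) {
        // On a device you would hand the camera a few buffers of this size
        // and use the buffered variant of the callback:
        //   camera.addCallbackBuffer(new byte[nv21BufferSize(640, 480)]);
        //   camera.setPreviewCallbackWithBuffer(new Camera.PreviewCallback() {
        //       public void onPreviewFrame(byte[] data, Camera cam) {
        //           // queue `data` into the encoder's input buffer here, then
        //           cam.addCallbackBuffer(data); // recycle the buffer
        //       }
        //   });
        System.out.println(nv21BufferSize(640, 480)); // 460800
    }
}
```

Using setPreviewCallbackWithBuffer together with addCallbackBuffer avoids a fresh allocation per frame, which matters at 10+ fps.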

I cannot find much documentation about MediaCodec (since it's fairly new). Does anyone have a solid tutorial?

This should be a good introduction to how MediaCodec works: http://dpsm.wordpress.com/2012/07/28/android-mediacodec-decoded/

Lastly: any recommendations on RTP libraries?

I'm using jlibrtp, which gets the job done.

I don't know anything about MediaCodec or MediaExtractor yet, but I am fairly familiar with MediaRecorder and have successfully implemented an RTSP server, based on SpyDroid, that captures H264/AMRNB output from MediaRecorder. The basic idea is that the code creates a local socket pair and uses MediaRecorder's setOutputFile to write output to one of the sockets in the pair. Then, the program reads the video or audio stream from the other socket, parses it into packets, and wraps each packet into one or more RTP packets which are sent over UDP.

It's true that MediaRecorder adds the MOOV headers after it's finished, but that's not a problem if you're serving H264 video in RTP format. Basically, there's an "mdat" header at the start of the video stream. It has 4 bytes for the length of the header, followed by the 4 bytes "mdat". Read the length to find out how long the header is, verify that it's the mdat header, and then skip the rest of the header data. From there on, you get a stream of NAL units, each of which starts with 4 bytes giving the unit length. Small NAL units can be sent in a single RTP packet, and larger units get broken up into FU packets. For RTSP, you also need to serve an SDP header that describes the stream. SpyDroid calculates the info in the SDP header by writing a very short movie to a file, then reading this file back to extract the MOOV header from the end. My app always uses the same size, format, and bit rate, so I just serve a static string:

public static final String SDP_STRING =
        "m=video 5006 RTP/AVP 96\n"
                + "b=RR:0\n"
                + "a=rtpmap:96 H264/90000\n"
                + "a=fmtp:96 packetization-mode=1;profile-level-id=428028;sprop-parameter-sets=Z0KAKJWgKA9E,aM48gA==;\n"
                + "a=control:trackID=0\n"
                + "m=audio 5004 RTP/AVP 96\n"
                + "b=AS:128\n"
                + "b=RR:0\n"
                + "a=rtpmap:96 AMR/8000\n"
                + "a=fmtp:96 octet-align=1;\n"
                + "a=control:trackID=1\n";

That's my header for 640x480 @ 10 fps H264 video, with 8000/16/1 AMRNB audio.
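The stream parsing described above can be sketched in plain Java, under the assumptions stated there: the 4-byte length at the start covers the mdat header itself, and each NAL unit is preceded by a 4-byte big-endian length (class and method names are illustrative):

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class MdatParser {
    // Skip the "mdat" box header at the start of the stream, then read the
    // first length-prefixed NAL unit (4-byte big-endian size before each unit).
    static byte[] readFirstNal(DataInputStream in) throws IOException {
        int headerLen = in.readInt();   // header length, including these 8 bytes
        byte[] tag = new byte[4];
        in.readFully(tag);
        if (tag[0] != 'm' || tag[1] != 'd' || tag[2] != 'a' || tag[3] != 't') {
            throw new IOException("expected mdat header");
        }
        in.skipBytes(headerLen - 8);    // skip any remaining header bytes
        int nalLen = in.readInt();      // length prefix of the NAL unit
        byte[] nal = new byte[nalLen];
        in.readFully(nal);
        return nal;
    }

    public static void main(String[] args) throws IOException {
        byte[] stream = {0, 0, 0, 8, 'm', 'd', 'a', 't',   // 8-byte mdat header
                         0, 0, 0, 2, 0x65, 0x1C};          // one 2-byte NAL unit
        byte[] nal = readFirstNal(new DataInputStream(new ByteArrayInputStream(stream)));
        System.out.println(nal.length); // 2
    }
}
```

In the real server the DataInputStream would wrap the local socket fed by MediaRecorder, and the length-prefixed read would loop for the lifetime of the stream.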

One thing I can warn you about: if you're using MediaRecorder, your preview callback will never get called. That only works in camera mode, not while you're recording video. I haven't been able to find any way of getting access to the preview image in uncompressed format while the video is recording.

I highly recommend looking over the code for SpyDroid. It takes some digging around, but I bet what you want is in there already.

What you plan is definitely feasible. You can register a Camera.PreviewCallback which takes the picture data and puts it into the MediaCodec. You read the output and send it as RTP. In general it's easy, but there are various pitfalls, such as undocumented color spaces and different MediaCodec behaviour on different devices; still, it's definitely possible.
