How to pack Android MediaCodec encoded H264 into RTP packets
How can I properly pack an H264 byte stream into RTP packets so that I can receive frames with FFMPEG?
When I start the FFMPEG receiver, it pumps out a lot of errors like these:
Invalid UE golomb code
[h264 @ 0xd63060] pps_id 3199971767 out of range
[h264 @ 0xd63060] slice type 32 too large at -1
[h264 @ 0xd63060] decode_slice_header error
[h264 @ 0xd63060] non-existing PPS 0 referenced
[h264 @ 0xd63060] decode_slice_header error
[h264 @ 0xd63060] no frame!
[h264 @ 0xd63060] decode_slice_header error
[h264 @ 0xd63060] Unknown NAL code: 0 (0 bits)
[h264 @ 0xd63060] no frame!
[h264 @ 0xd63060] non-existing PPS 0 referenced
Here is the SDP file I use:
c=IN IP4 192.168.2.30
t=0 0
m=video 51372 RTP/AVP 96
a=rtpmap:96 H264/90000
a=recv only
The pps_id error is curious. It's as if the decoder is looking for the next PPS but can't find it, even though I tried embedding the PPS into each NALU.
I've been reading RFC 6184 and trying to understand it, but I feel I still don't quite understand how H264 and RTP interact. Currently I'm trying to encode pixels from a camera and stream 1920x1080 H264 encoded frames through RTP across the network, where they are then received by FFMPEG and decoded. I'm assembling the RTP and FU-A headers in Java and fragmenting the NALUs when they are too large for the MTU.
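For reference, the FU-A fragmentation described in RFC 6184 can be sketched roughly as follows. This is a minimal illustration, not the code from the post: `H264Payloader`, `payloadsFor`, and `maxPayload` are hypothetical names, and the sketch assumes the input NAL unit has already had its Annex B start code removed so that its first byte is the NAL header.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: builds RTP payloads (without the 12-byte RTP header) for one NAL unit. */
final class H264Payloader {
    private static final int FU_A = 28; // NAL type used by the FU indicator

    /**
     * nal: one NAL unit WITHOUT start code; nal[0] is the NAL header byte.
     * maxPayload: largest RTP payload size allowed (MTU minus IP/UDP/RTP headers).
     */
    static List<byte[]> payloadsFor(byte[] nal, int maxPayload) {
        if (nal.length == 0 || maxPayload < 3) {
            throw new IllegalArgumentException("empty NAL unit or payload size too small");
        }
        List<byte[]> out = new ArrayList<>();
        if (nal.length <= maxPayload) {
            out.add(nal.clone()); // fits: send as a single NAL unit packet
            return out;
        }
        int header = nal[0] & 0xFF;
        byte fuIndicator = (byte) ((header & 0xE0) | FU_A); // keep F and NRI, set type 28
        int originalType = header & 0x1F;                    // goes into the FU header
        int chunk = maxPayload - 2;                          // room left after the two FU bytes
        int offset = 1;                                      // the NAL header byte itself is not copied
        while (offset < nal.length) {
            int len = Math.min(chunk, nal.length - offset);
            boolean start = (offset == 1);
            boolean end = (offset + len == nal.length);
            byte[] p = new byte[len + 2];
            p[0] = fuIndicator;
            p[1] = (byte) ((start ? 0x80 : 0) | (end ? 0x40 : 0) | originalType);
            System.arraycopy(nal, offset, p, 2, len);
            out.add(p);
            offset += len;
        }
        return out;
    }
}
```

Each returned array would then be sent behind its own RTP header. The original NAL header byte is dropped from the fragment data because the receiver rebuilds it from the FU indicator (F and NRI bits) and the FU header (type bits).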
I've been watching the stream closely in Wireshark. Here is the output of my first packet:
Real-Time Transport Protocol
10.. .... = Version: RFC 1889 Version (2)
..0. .... = Padding: False
...0 .... = Extension: False
.... 0000 = Contributing source identifiers count: 0
1... .... = Marker: True
Payload type: DynamicRTP-Type-96 (96)
Sequence number: 0
Timestamp: 2727179012
Synchronization Source identifier: 0x00000000 (0)
H.264
NAL unit header or first byte of the payload
0... .... = F bit: No bit errors or other syntax violations
.00. .... = Nal_ref_idc (NRI): 0
...0 0000 = Type: Undefined (0)
H264 NAL Unit Payload
I don't understand why the first payload has a NALU type of 0. Nevertheless, here is my second packet:
Real-Time Transport Protocol
10.. .... = Version: RFC 1889 Version (2)
..0. .... = Padding: False
...0 .... = Extension: False
.... 0000 = Contributing source identifiers count: 0
0... .... = Marker: False
Payload type: DynamicRTP-Type-96 (96)
Sequence number: 1
Timestamp: 2727179019
Synchronization Source identifier: 0x00000000 (0)
H.264
FU identifier
0... .... = F bit: No bit errors or other syntax violations
.11. .... = Nal_ref_idc (NRI): 3
...1 1100 = Type: Fragmentation unit A (FU-A) (28)
FU Header
1... .... = Start bit: the first packet of FU-A picture
.0.. .... = End bit: Not the last packet of FU-A picture
..0. .... = Forbidden bit: 0
...0 0101 = Nal_unit_type: Coded slice of an IDR picture (5)
H264 NAL Unit Payload
0000 0000 0000 0000 0000 0000 0000 0001 0110 0101 1011 1000 0000 0100 0000 010. = first_mb_in_slice: 3000762881
.... ...1 = slice_type: P (P slice) (0)
0011 1... = pic_parameter_set_id: 6
So I think the last packet was an I-frame? Here is a fragment between the start and end fragments:
Real-Time Transport Protocol
10.. .... = Version: RFC 1889 Version (2)
..0. .... = Padding: False
...0 .... = Extension: False
.... 0000 = Contributing source identifiers count: 0
0... .... = Marker: False
Payload type: DynamicRTP-Type-96 (96)
Sequence number: 1
Timestamp: 2727179019
Synchronization Source identifier: 0x00000000 (0)
H.264
FU identifier
0... .... = F bit: No bit errors or other syntax violations
.11. .... = Nal_ref_idc (NRI): 3
...1 1100 = Type: Fragmentation unit A (FU-A) (28)
FU Header
0... .... = Start bit: Not the first packet of FU-A picture
.0.. .... = End bit: Not the last packet of FU-A picture
..0. .... = Forbidden bit: 0
...0 0101 = Nal_unit_type: Coded slice of an IDR picture (5)
And of course, here is the last packet of the supposed I-frame:
Real-Time Transport Protocol
10.. .... = Version: RFC 1889 Version (2)
..0. .... = Padding: False
...0 .... = Extension: False
.... 0000 = Contributing source identifiers count: 0
1... .... = Marker: True
Payload type: DynamicRTP-Type-96 (96)
Sequence number: 1
Timestamp: 2727179019
Synchronization Source identifier: 0x00000000 (0)
H.264
FU identifier
0... .... = F bit: No bit errors or other syntax violations
.11. .... = Nal_ref_idc (NRI): 3
...1 1100 = Type: Fragmentation unit A (FU-A) (28)
FU Header
0... .... = Start bit: Not the first packet of FU-A picture
.1.. .... = End bit: the last packet of FU-A picture
..0. .... = Forbidden bit: 0
...0 0101 = Nal_unit_type: Coded slice of an IDR picture (5)
Now here is my packet for the next bytes the encoder gave me:
Real-Time Transport Protocol
10.. .... = Version: RFC 1889 Version (2)
..0. .... = Padding: False
...0 .... = Extension: False
.... 0000 = Contributing source identifiers count: 0
0... .... = Marker: False
Payload type: DynamicRTP-Type-96 (96)
Sequence number: 2
Timestamp: 2727179089
Synchronization Source identifier: 0x00000000 (0)
H.264
FU identifier
0... .... = F bit: No bit errors or other syntax violations
.11. .... = Nal_ref_idc (NRI): 3
...1 1100 = Type: Fragmentation unit A (FU-A) (28)
FU Header
1... .... = Start bit: the first packet of FU-A picture
.0.. .... = End bit: Not the last packet of FU-A picture
..0. .... = Forbidden bit: 0
...0 0001 = Nal_unit_type: Coded slice of a non-IDR picture (1)
H264 NAL Unit Payload
0000 0000 0000 0000 0000 0000 0000 0001 0110 0001 1110 0000 0010 0000 0001 100. = first_mb_in_slice: 2968522763
.... ...0 0111 .... = slice_type: B (B slice) (6)
.... 0001 110. .... = pic_parameter_set_id: 13
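Two details in the dumps above look inconsistent with RFC 3550 and with the `H264/90000` clock in the SDP. All three fragments of the IDR slice carry sequence number 1, whereas the sequence number is supposed to increase by one for every RTP packet sent. The timestamps also advance by only 7 and then 70 between packets, which looks more like milliseconds than 90 kHz ticks (about 3000 ticks per frame at 30 fps). The following is a minimal sketch of a header builder under those assumptions. `RtpHeader`, `build`, and `toRtpTimestamp` are hypothetical names, and using the encoder's `presentationTimeUs` as the time source is an assumption.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper: builds the fixed 12-byte RTP header (no CSRCs, no extension). */
final class RtpHeader {
    static final int SIZE = 12;

    static byte[] build(int payloadType, boolean marker, int seq, long timestamp90k, int ssrc) {
        ByteBuffer b = ByteBuffer.allocate(SIZE);  // ByteBuffer is big-endian (network order) by default
        b.put((byte) 0x80);                         // V=2, P=0, X=0, CC=0
        b.put((byte) ((marker ? 0x80 : 0) | (payloadType & 0x7F)));
        b.putShort((short) seq);                    // caller increments this for EVERY packet
        b.putInt((int) timestamp90k);               // same value for all packets of one frame
        b.putInt(ssrc);                             // ideally a random, non-zero stream identifier
        return b.array();
    }

    /** Converts a presentation time in microseconds to 90 kHz RTP ticks (wraps at 32 bits). */
    static long toRtpTimestamp(long presentationTimeUs) {
        return (presentationTimeUs * 9L / 100L) & 0xFFFFFFFFL;
    }
}
```

With this layout, every fragment of one frame shares a timestamp but gets its own sequence number, and per RFC 6184 the marker bit is set only on the last packet of the access unit.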
This part confuses me. When the camera is stationary, the encoder gives me smaller and smaller NALUs with undefined types, and I'm not entirely sure why. In any case, the packet below gets sent to FFMPEG as one whole NALU.
Real-Time Transport Protocol
10.. .... = Version: RFC 1889 Version (2)
..0. .... = Padding: False
...0 .... = Extension: False
.... 0000 = Contributing source identifiers count: 0
1... .... = Marker: True
Payload type: DynamicRTP-Type-96 (96)
Sequence number: 36
Timestamp: 2727180258
Synchronization Source identifier: 0x00000000 (0)
H.264
NAL unit header or first byte of the payload
0... .... = F bit: No bit errors or other syntax violations
.00. .... = Nal_ref_idc (NRI): 0
...0 0000 = Type: Undefined (0)
H264 NAL Unit Payload
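One possible explanation for the "Undefined (0)" type: the payload bits shown in the IDR fragment earlier begin with `00 00 00 01 65`, which is an Annex B start code followed by an IDR NAL header. If the start code is left in place, the first payload byte is 0x00, and its low five bits decode as NAL type 0. The sketch below shows one way to check what a buffer really contains before it is packetized. `NalInspector` and its methods are hypothetical names, and whether this was the actual fault in the original code is an assumption.

```java
/** Hypothetical helper for checking what is about to be placed in an RTP payload. */
final class NalInspector {
    /** Length of a leading Annex B start code (4 or 3 bytes), or 0 if the buffer has none. */
    static int startCodeLength(byte[] buf) {
        if (buf.length >= 4 && buf[0] == 0 && buf[1] == 0 && buf[2] == 0 && buf[3] == 1) {
            return 4;
        }
        if (buf.length >= 3 && buf[0] == 0 && buf[1] == 0 && buf[2] == 1) {
            return 3;
        }
        return 0;
    }

    /** nal_unit_type (low 5 bits) of the first byte after any start code, or -1 if there is no such byte. */
    static int nalType(byte[] buf) {
        int skip = startCodeLength(buf);
        if (buf.length <= skip) {
            return -1;
        }
        return buf[skip] & 0x1F;
    }
}
```

For a buffer starting with `00 00 00 01 65`, reading the type from byte 0 gives 0, while `nalType` gives 5 (IDR slice). RTP payloads per RFC 6184 carry the NAL unit without the start code.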
I'm using the Android MediaCodec encoder, and here is the code where I configure it:
mediaCodec = MediaCodec.createByCodecName("OMX.Nvidia.h264.encoder");
mediaFormat = MediaFormat.createVideoFormat("video/avc", 1920, 1080);
mediaFormat.setInteger(MediaFormat.KEY_BIT_RATE, 125000);
mediaFormat.setInteger(MediaFormat.KEY_FRAME_RATE, 30);
mediaFormat.setInteger(MediaFormat.KEY_COLOR_FORMAT, MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface);
mediaFormat.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 0);
mediaFormat.setInteger(MediaFormat.KEY_MAX_INPUT_SIZE, 1920 * 1080);
Is the encoder giving me whole access units or only NALUs?
Here is my logic:
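MediaCodec H.264 encoders typically emit Annex B data: a codec-config buffer holding SPS and PPS, then output buffers that may each hold one or more NAL units separated by start codes. Treat that as an assumption to verify on this particular device rather than a guarantee. A safe approach is to split every output buffer at the start codes and packetize each NAL unit separately. The following is a minimal sketch; `AnnexBSplitter` and `split` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: splits an Annex B buffer into NAL units with start codes removed. */
final class AnnexBSplitter {
    /** Examines buf[0..length) and returns each NAL unit found after a 00 00 01 start code. */
    static List<byte[]> split(byte[] buf, int length) {
        List<byte[]> nals = new ArrayList<>();
        int nalStart = -1; // index of the first byte of the NAL unit being collected
        int i = 0;
        while (i + 3 <= length) {
            if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1) {
                if (nalStart >= 0) {
                    int end = i;
                    // Zero bytes right before a start code belong to the (4-byte) start code
                    // or are trailing padding, not to the previous NAL unit.
                    while (end > nalStart && buf[end - 1] == 0) {
                        end--;
                    }
                    if (end > nalStart) {
                        nals.add(Arrays.copyOfRange(buf, nalStart, end));
                    }
                }
                nalStart = i + 3;
                i += 3;
            } else {
                i++;
            }
        }
        if (nalStart >= 0 && nalStart < length) {
            nals.add(Arrays.copyOfRange(buf, nalStart, length)); // last NAL runs to the end
        }
        return nals;
    }
}
```

The type of each returned unit is `nal[0] & 0x1F`, so SPS (7) and PPS (8) can be recognized and sent as their own packets ahead of the slices that reference them.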
I feel like I'm close, but it's clearly not working for any RTP receivers. I appreciate any thoughts or ideas on the matter.
Thanks,
I finally managed to work it out: my packets were not configured properly.
I can even start FFmpeg in the middle of the stream and it works!