Problem decoding H264 video over RTP with ffmpeg (libavcodec)

Question

I set the AVCodecContext's profile_idc, level_idc, extradata and extradata_size using the profile-level-id and sprop-parameter-sets from the SDP.

I handle the decoding of Coded Slice, SPS, PPS and NAL_IDR_SLICE packets separately:

Init:

uint8_t start_sequence[] = {0, 0, 1};
int size = recv(id_de_la_socket, (char *)rtpReceive, 65535, 0);

Coded Slice:

char *z = new char[size-16+sizeof(start_sequence)];
    memcpy(z,&start_sequence,sizeof(start_sequence));
    memcpy(z+sizeof(start_sequence),rtpReceive+16,size-16);
    ConsumedBytes = avcodec_decode_video(codecContext,pFrame,&GotPicture,(uint8_t*)z,size-16+sizeof(start_sequence));
    delete[] z;  // memory allocated with new[] must be released with delete[]

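Result: ConsumedBytes > 0 and GotPicture > 0 (often)

SPS and PPS:

Same code. Result: ConsumedBytes > 0 and GotPicture = 0

I think this is normal.

When I find a new SPS/PPS pair, I update extradata and extradata_size with the payloads of these packets and their sizes.

NAL_IDR_SLICE:

The NAL unit type is 28, so the IDR frame is fragmented. I tried two methods to decode it:

1) I prefix the first fragment (without the RTP header) with the sequence 0x000001 and send it to avcodec_decode_video. Then I send the remaining fragments to this function.

2) I prefix the first fragment (without the RTP header) with the sequence 0x000001 and concatenate the remaining fragments to it. I send this buffer to the decoder.

In both cases I get no error (ConsumedBytes > 0), but no frame is detected (GotPicture = 0)...

What is the problem?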

Answer 1

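In RTP, H264 I-frames (IDR) are usually fragmented. When you receive an RTP packet, you first have to skip the header (usually the first 12 bytes) and then look at the NAL unit (the first payload byte). If the NAL unit type is 28 (0x1C), the payload that follows is one fragment of a larger NAL unit, typically an H264 IDR (I-frame), and you need to collect all of the fragments to reconstruct it.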
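The header is "usually" 12 bytes because the fixed RTP header can be followed by CSRC entries and a header extension. The sketch below computes where the payload starts; `rtp_payload_offset` is a hypothetical helper name, not a libavcodec function, and it ignores RTP padding at the end of the packet.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns the offset of the first payload byte of an RTP packet,
// or 0 if the packet is too short or is not RTP version 2.
// Hypothetical helper for illustration only.
std::size_t rtp_payload_offset(const std::uint8_t* pkt, std::size_t len) {
    if (len < 12 || (pkt[0] >> 6) != 2) return 0;
    // 12-byte fixed header plus 4 bytes per CSRC entry (low 4 bits of byte 0).
    std::size_t offset = 12 + 4u * (pkt[0] & 0x0F);
    if (pkt[0] & 0x10) {  // X bit: a header extension follows the CSRC list
        if (len < offset + 4) return 0;
        // Extension length is a 16-bit big-endian count of 32-bit words.
        std::size_t words = (static_cast<std::size_t>(pkt[offset + 2]) << 8) | pkt[offset + 3];
        offset += 4 + 4 * words;
    }
    return offset < len ? offset : 0;
}
```

If the stream in the question really uses a plain 12-byte header, skipping a fixed 16 bytes as its code does would drop the first 4 payload bytes, so it may be worth checking the offset this way.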

Fragmentation happens because the MTU is limited and an IDR is much larger. A fragment can look like this:

Fragment with START BIT = 1:

First byte:  [ 3 NAL UNIT BITS | 5 FRAGMENT TYPE BITS] 
Second byte: [ START BIT | END BIT | RESERVED BIT | 5 NAL UNIT BITS] 
Other bytes: [... IDR FRAGMENT DATA...]

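Other fragments (START BIT = 0) carry the same two header bytes:

First byte:  [ 3 NAL UNIT BITS | 5 FRAGMENT TYPE BITS]
Second byte: [ START BIT | END BIT | RESERVED BIT | 5 NAL UNIT BITS]
Other bytes: [... IDR FRAGMENT DATA...]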

To reconstruct the IDR, you have to collect this information:

int fragment_type = Data[0] & 0x1F;
int nal_type = Data[1] & 0x1F;
int start_bit = Data[1] & 0x80;
int end_bit = Data[1] & 0x40;

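If fragment_type == 28, the payload that follows is one fragment of the IDR. Next, check whether start_bit is set; if it is, this fragment is the first one in the sequence. You use it to reconstruct the IDR's NAL byte by taking the first 3 bits of the first payload byte (3 NAL UNIT BITS, that is, the F and NRI bits) and combining them with the last 5 bits of the second payload byte (5 NAL UNIT BITS), so you get a byte like this: [3 NAL UNIT BITS | 5 NAL UNIT BITS]. Write that NAL byte first into a clean buffer, followed by all the remaining bytes of that fragment. Remember to skip the first two bytes of the fragment, since they are not part of the IDR and only identify the fragment.

If start_bit and end_bit are both 0, just write the payload (skipping the two header bytes that identify the fragment) to the buffer.

If start_bit is 0 and end_bit is 1, it is the last fragment: write its payload (again skipping the two header bytes) to the buffer, and you have reconstructed the IDR.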
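The three cases above can be sketched as a single function. This is a minimal illustration, assuming the RTP header has already been stripped and that packets arrive in order; `append_fu_a` is a hypothetical name, and a real receiver would also check RTP sequence numbers to detect lost fragments.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Feeds one RTP payload into `nal`. Returns true once `nal` holds a
// complete NAL unit (header byte plus data, without a start code).
// Hypothetical helper for illustration only.
bool append_fu_a(const std::uint8_t* data, std::size_t len,
                 std::vector<std::uint8_t>& nal) {
    if (len < 3 || (data[0] & 0x1F) != 28) return false;  // not an FU-A
    const bool start = (data[1] & 0x80) != 0;
    const bool end = (data[1] & 0x40) != 0;
    if (start) {
        nal.clear();
        // Rebuild the NAL header: top 3 bits of byte 0, low 5 bits of byte 1.
        nal.push_back(static_cast<std::uint8_t>((data[0] & 0xE0) | (data[1] & 0x1F)));
    } else if (nal.empty()) {
        return false;  // the start fragment was missed, wait for the next one
    }
    // Skip the two header bytes and append the fragment data.
    nal.insert(nal.end(), data + 2, data + len);
    return end;
}
```

Once the function returns true, the buffer can be prefixed with 0x000001 and passed to the decoder as a single unit.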

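If you need some code, just ask in a comment and I'll post it, but I think it is pretty clear how to do this... =)

Concerning the decoding

It crossed my mind today why you might be getting errors when decoding the IDR (I presume you have reconstructed it correctly). How are you building your AVC Decoder Configuration Record (AVCDCR)? Does the lib you use do that automatically? If not, and you haven't heard of this, read on...

The AVCDCR is specified so that decoders can quickly parse all the data they need to decode an H264 (AVC) video stream. The data is the following: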

ProfileIDC
ProfileIOP
LevelIDC
SPS (Sequence Parameter Sets)
PPS (Picture Parameter Sets)

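All of this data is sent in the RTSP session, in the SDP, under the fields profile-level-id and sprop-parameter-sets.

Decoding PROFILE-LEVEL-ID

The profile-level-id string is divided into 3 substrings, each 2 characters long: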

[PROFILE IDC][PROFILE IOP][LEVEL IDC]

Each substring represents one byte in base16! So if Profile IDC is 28, it is actually 40 in base10. Later you will use these byte values to construct the AVC Decoder Configuration Record.
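As a rough sketch of that conversion, the function below turns the 6-character string into three bytes. The struct and function names are made up for this example.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string>

// Illustrative container for the three decoded bytes.
struct ProfileLevelId {
    std::uint8_t profile_idc;
    std::uint8_t profile_iop;
    std::uint8_t level_idc;
};

// Parses a profile-level-id value such as "42E01E".
// Returns false unless the string is exactly 6 hex digits.
bool parse_profile_level_id(const std::string& s, ProfileLevelId& out) {
    if (s.size() != 6) return false;
    for (char c : s) {
        if (!std::isxdigit(static_cast<unsigned char>(c))) return false;
    }
    const unsigned long v = std::stoul(s, nullptr, 16);
    out.profile_idc = static_cast<std::uint8_t>((v >> 16) & 0xFF);
    out.profile_iop = static_cast<std::uint8_t>((v >> 8) & 0xFF);
    out.level_idc = static_cast<std::uint8_t>(v & 0xFF);
    return true;
}
```

For "42E01E" this gives profile_idc 66, profile_iop 224 and level_idc 30.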

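Decoding SPROP-PARAMETER-SETS

The sprops are usually 2 strings (there could be more) that are comma separated and base64 encoded! You could parse their contents, but there is no need to. Your job here is just to convert them from base64 strings into byte arrays for later use. Now you have 2 byte arrays: the first one is the SPS, the second one is the PPS.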
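A small sketch of that conversion follows, with a hand-written base64 decoder so it has no dependencies. Both function names are hypothetical, and the decoder simply stops at padding or at any character outside the standard alphabet instead of reporting an error.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Decodes standard base64; stops at '=' or at the first invalid character.
std::vector<std::uint8_t> base64_decode(const std::string& in) {
    static const std::string alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::vector<std::uint8_t> out;
    unsigned buffer = 0;  // only the low bits are used, overflow is harmless
    int bits = 0;
    for (char c : in) {
        const std::size_t pos = alphabet.find(c);
        if (pos == std::string::npos) break;
        buffer = (buffer << 6) | static_cast<unsigned>(pos);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out.push_back(static_cast<std::uint8_t>((buffer >> bits) & 0xFF));
        }
    }
    return out;
}

// Splits the sprop-parameter-sets value on commas and decodes each part.
std::vector<std::vector<std::uint8_t>> decode_sprop_parameter_sets(const std::string& sprop) {
    std::vector<std::vector<std::uint8_t>> sets;
    std::stringstream ss(sprop);
    std::string item;
    while (std::getline(ss, item, ',')) sets.push_back(base64_decode(item));
    return sets;
}
```

With two entries in the SDP, element 0 of the result is the SPS and element 1 is the PPS.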

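Building the AVCDCR

Now you have everything you need to build the AVCDCR. Start by creating a new clean buffer, then write the following into it in the order given here:

1 - A byte with value 1 that represents the version

2 - The Profile IDC byte

3 - The Profile IOP byte

4 - The Level IDC byte

5 - A byte with value 0xFF (six reserved bits set to 1 plus the 2-bit NAL length size minus one, so 0xFF means 4-byte lengths)

6 - A byte with value 0xE1 (three reserved bits set to 1 plus the 5-bit SPS count, here 1)

7 - A short (16 bits, big-endian) with the length of the SPS array

8 - The SPS byte array

9 - A byte with the number of PPS arrays (there can be more of them in sprop-parameter-sets)

10 - A short (16 bits, big-endian) with the length of the following PPS array

11 - The PPS array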
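The eleven steps above map directly onto a short function. This sketch assumes exactly one SPS and one PPS and 4-byte NAL lengths; `build_avcdcr` and the `Bytes` alias are names invented for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Builds an AVC Decoder Configuration Record from one SPS and one PPS.
// Hypothetical helper for illustration only.
Bytes build_avcdcr(std::uint8_t profile_idc, std::uint8_t profile_iop,
                   std::uint8_t level_idc, const Bytes& sps, const Bytes& pps) {
    Bytes r;
    r.push_back(1);            // 1: version
    r.push_back(profile_idc);  // 2
    r.push_back(profile_iop);  // 3
    r.push_back(level_idc);    // 4
    r.push_back(0xFF);         // 5: reserved bits + 4-byte NAL lengths
    r.push_back(0xE1);         // 6: reserved bits + one SPS
    r.push_back(static_cast<std::uint8_t>((sps.size() >> 8) & 0xFF));  // 7: SPS length,
    r.push_back(static_cast<std::uint8_t>(sps.size() & 0xFF));         //    big-endian
    r.insert(r.end(), sps.begin(), sps.end());                         // 8
    r.push_back(1);                                                    // 9: one PPS
    r.push_back(static_cast<std::uint8_t>((pps.size() >> 8) & 0xFF));  // 10: PPS length,
    r.push_back(static_cast<std::uint8_t>(pps.size() & 0xFF));         //     big-endian
    r.insert(r.end(), pps.begin(), pps.end());                         // 11
    return r;
}
```

One thing to verify against your library's documentation: a decoder that is given a record in this form usually expects each NAL unit to be prefixed with its length (4 bytes here) rather than with a 0x000001 start code.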

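Decoding the video stream

Now you have a byte array that tells the decoder how to decode the H264 video stream. I believe you need this if your lib doesn't build it from the SDP by itself...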

Answer 2

I have an implementation of this for C# at https://net7mma.codeplex.com/, but the process is the same everywhere.

Here is the relevant code:

/// <summary>
    /// Implements Packetization and Depacketization of packets defined in <see href="https://tools.ietf.org/html/rfc6184">RFC6184</see>.
    /// </summary>
    public class RFC6184Frame : Rtp.RtpFrame
    {
        /// <summary>
        /// Start code prefix written before each nal (this is not emulation prevention)
        /// </summary>
        static byte[] NalStart = { 0x00, 0x00, 0x01 };

        public RFC6184Frame(byte payloadType) : base(payloadType) { }

        public RFC6184Frame(Rtp.RtpFrame existing) : base(existing) { }

        public RFC6184Frame(RFC6184Frame f) : this((Rtp.RtpFrame)f) { Buffer = f.Buffer; }

        public System.IO.MemoryStream Buffer { get; set; }

        /// <summary>
        /// Creates any <see cref="Rtp.RtpPacket"/>'s required for the given nal
        /// </summary>
        /// <param name="nal">The nal</param>
        /// <param name="mtu">The mtu</param>
        public virtual void Packetize(byte[] nal, int mtu = 1500)
        {
            if (nal == null) return;

            int nalLength = nal.Length;

            int offset = 0;

            if (nalLength >= mtu)
            {
                //FU indicator: F and NRI bits of the nal header with type 28 (FU - A)
                //FU header: start bit and the original nal unit type
                byte[] FUI = new byte[] { (byte)((nal[0] & 0xE0) | 28), (byte)((1 << 7) | (nal[0] & 0x1F)) };

                bool marker = false;

                //The nal header is carried by the FU bytes, so skip it
                offset = 1;

                //Leave room for the two FU bytes in each packet
                int fragmentSize = mtu - 2;

                while (offset < nalLength)
                {
                    //For packets other than the start, clear the start bit
                    if (offset > 1) FUI[1] &= 0x7F;

                    //Set the end bit if no more data remains
                    if (offset + fragmentSize >= nalLength)
                    {
                        FUI[1] |= (byte)(1 << 6);
                        marker = true;
                    }

                    //Add the packet
                    Add(new Rtp.RtpPacket(2, false, false, marker, PayloadTypeByte, 0, SynchronizationSourceIdentifier, HighestSequenceNumber + 1, 0, FUI.Concat(nal.Skip(offset).Take(fragmentSize)).ToArray()));

                    //Move the offset
                    offset += fragmentSize;
                }
            } //Should check for first byte to be 1 - 23?
            else Add(new Rtp.RtpPacket(2, false, false, true, PayloadTypeByte, 0, SynchronizationSourceIdentifier, HighestSequenceNumber + 1, 0, nal));
        }

        /// <summary>
        /// Creates <see cref="Buffer"/> with a H.264 RBSP from the contained packets
        /// </summary>
        public virtual void Depacketize() { bool sps, pps, sei, slice, idr; Depacketize(out sps, out pps, out sei, out slice, out idr); }

        /// <summary>
        /// Parses all contained packets and writes any contained Nal Units in the RBSP to <see cref="Buffer"/>.
        /// </summary>
        /// <param name="containsSps">Indicates if a Sequence Parameter Set was found</param>
        /// <param name="containsPps">Indicates if a Picture Parameter Set was found</param>
        /// <param name="containsSei">Indicates if Supplemental Enhancement Information was found</param>
        /// <param name="containsSlice">Indicates if a Slice was found</param>
        /// <param name="isIdr">Indicates if a IDR Slice was found</param>
        public virtual void Depacketize(out bool containsSps, out bool containsPps, out bool containsSei, out bool containsSlice, out bool isIdr)
        {
            containsSps = containsPps = containsSei = containsSlice = isIdr = false;

            DisposeBuffer();

            this.Buffer = new MemoryStream();

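            //Get all packets in the frame, keeping any flag set by an earlier packet
            foreach (Rtp.RtpPacket packet in m_Packets.Values.Distinct())
            {
                bool sps, pps, sei, slice, idr;
                ProcessPacket(packet, out sps, out pps, out sei, out slice, out idr);
                containsSps |= sps; containsPps |= pps; containsSei |= sei; containsSlice |= slice; isIdr |= idr;
            }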

            //Order by DON?
            this.Buffer.Position = 0;
        }

        /// <summary>
        /// Depacketizes a single packet.
        /// </summary>
        /// <param name="packet"></param>
        /// <param name="containsSps"></param>
        /// <param name="containsPps"></param>
        /// <param name="containsSei"></param>
        /// <param name="containsSlice"></param>
        /// <param name="isIdr"></param>
        internal protected virtual void ProcessPacket(Rtp.RtpPacket packet, out bool containsSps, out bool containsPps, out bool containsSei, out bool containsSlice, out bool isIdr)
        {
            containsSps = containsPps = containsSei = containsSlice = isIdr = false;

            //Starting at offset 0
            int offset = 0;

            //Obtain the data of the packet (without source list or padding)
            byte[] packetData = packet.Coefficients.ToArray();

            //Cache the length
            int count = packetData.Length;

            //Must have at least 2 bytes
            if (count <= 2) return;

            //Determine if the forbidden bit is set and the type of nal from the first byte
            byte firstByte = packetData[offset];

            //bool forbiddenZeroBit = ((firstByte & 0x80) >> 7) != 0;

            byte nalUnitType = (byte)(firstByte & Common.Binary.FiveBitMaxValue);

            //o  The F bit MUST be cleared if all F bits of the aggregated NAL units are zero; otherwise, it MUST be set.
            //if (forbiddenZeroBit && nalUnitType <= 23 && nalUnitType > 29) throw new InvalidOperationException("Forbidden Zero Bit is Set.");

            //Determine what to do
            switch (nalUnitType)
            {
                //Reserved - Ignore
                case 0:
                case 30:
                case 31:
                    {
                        return;
                    }
                case 24: //STAP - A
                case 25: //STAP - B
                case 26: //MTAP - 16
                case 27: //MTAP - 24
                    {
                        //Move to Nal Data
                        ++offset;

                        //Todo Determine if need to Order by DON first.
                        //EAT DON for ALL BUT STAP - A
                        if (nalUnitType != 24) offset += 2;

                        //Consume the rest of the data from the packet
                        while (offset < count)
                        {
                            //Determine the nal unit size which does not include the nal header
                            int tmp_nal_size = Common.Binary.Read16(packetData, offset, BitConverter.IsLittleEndian);
                            offset += 2;

                            //If the nal had data then write it
                            if (tmp_nal_size > 0)
                            {
                                //For DOND and TSOFFSET
                                switch (nalUnitType)
                                {
                                    case 26:// MTAP - 16
                                        {
                                            //SKIP DOND (1 byte) and TSOFFSET (2 bytes)
                                            offset += 3;
                                            goto default;
                                        }
                                    case 27:// MTAP - 24
                                        {
                                            //SKIP DOND (1 byte) and TSOFFSET (3 bytes)
                                            offset += 4;
                                            goto default;
                                        }
                                    default:
                                        {
                                            //Read the nal header but don't move the offset
                                            byte nalHeader = (byte)(packetData[offset] & Common.Binary.FiveBitMaxValue);

                                            if (nalHeader > 5)
                                            {
                                                if (nalHeader == 6)
                                                {
                                                    Buffer.WriteByte(0);
                                                    containsSei = true;
                                                }
                                                else if (nalHeader == 7) //7 is SPS
                                                {
                                                    Buffer.WriteByte(0);
                                                    containsSps = true;
                                                }
                                                else if (nalHeader == 8) //8 is PPS
                                                {
                                                    Buffer.WriteByte(0);
                                                    containsPps = true;
                                                }
                                            }

                                            if (nalHeader == 1) containsSlice = true;

                                            if (nalHeader == 5) isIdr = true;

                                            //Done reading
                                            break;
                                        }
                                }

                                //Write the start code
                                Buffer.Write(NalStart, 0, 3);

                                //Write the nal header and data
                                Buffer.Write(packetData, offset, tmp_nal_size);

                                //Move the offset past the nal
                                offset += tmp_nal_size;
                            }
                        }

                        return;
                    }
                case 28: //FU - A
                case 29: //FU - B
                    {
                        /*
                         Informative note: When an FU-A occurs in interleaved mode, it
                         always follows an FU-B, which sets its DON.
                         * Informative note: If a transmitter wants to encapsulate a single
                          NAL unit per packet and transmit packets out of their decoding
                          order, STAP-B packet type can be used.
                         */
                        //Need 2 bytes
                        if (count > 2)
                        {
                            //Read the Header
                            byte FUHeader = packetData[++offset];

                            bool Start = ((FUHeader & 0x80) >> 7) > 0;

                            //bool End = ((FUHeader & 0x40) >> 6) > 0;

                            //bool Receiver = (FUHeader & 0x20) != 0;

                            //if (Receiver) throw new InvalidOperationException("Receiver Bit Set");

                            //Move to data
                            ++offset;

                            //Todo Determine if need to Order by DON first.
                            //DON Present in FU - B
                            if (nalUnitType == 29) offset += 2;

                            //Determine the fragment size
                            int fragment_size = count - offset;

                            //If the size was valid
                            if (fragment_size > 0)
                            {
                                //If the start bit was set
                                if (Start)
                                {
                                    //Reconstruct the nal header
                                    //Use the first 3 bits of the first byte and last 5 bits of the FU Header
                                    byte nalHeader = (byte)((firstByte & 0xE0) | (FUHeader & Common.Binary.FiveBitMaxValue));

                                    //The type is only the last 5 bits; the header also carries the F and NRI bits
                                    byte nalType = (byte)(FUHeader & Common.Binary.FiveBitMaxValue);

                                    //Could have been SPS / PPS / SEI
                                    if (nalType > 5)
                                    {
                                        if (nalType == 6)
                                        {
                                            Buffer.WriteByte(0);
                                            containsSei = true;
                                        }
                                        else if (nalType == 7) //7 is SPS
                                        {
                                            Buffer.WriteByte(0);
                                            containsSps = true;
                                        }
                                        else if (nalType == 8) //8 is PPS
                                        {
                                            Buffer.WriteByte(0);
                                            containsPps = true;
                                        }
                                    }

                                    if (nalType == 1) containsSlice = true;

                                    if (nalType == 5) isIdr = true;

                                    //Write the start code
                                    Buffer.Write(NalStart, 0, 3);

                                    //Write the re-constructed header
                                    Buffer.WriteByte(nalHeader);
                                }

                                //Write the data of the fragment.
                                Buffer.Write(packetData, offset, fragment_size);
                            }
                        }
                        return;
                    }
                default:
                    {
                        // 6 SEI, 7 and 8 are SPS and PPS
                        if (nalUnitType > 5)
                        {
                            if (nalUnitType == 6)
                            {
                                Buffer.WriteByte(0);
                                containsSei = true;
                            }
                            else if (nalUnitType == 7) //7 is SPS
                            {
                                Buffer.WriteByte(0);
                                containsSps = true;
                            }
                            else if (nalUnitType == 8) //8 is PPS
                            {
                                Buffer.WriteByte(0);
                                containsPps = true;
                            }
                        }

                        if (nalUnitType == 1) containsSlice = true;

                        if (nalUnitType == 5) isIdr = true;

                        //Write the start code
                        Buffer.Write(NalStart, 0, 3);

                        //Write the nal header and data
                        Buffer.Write(packetData, offset, count - offset);

                        return;
                    }
            }
        }

        internal void DisposeBuffer()
        {
            if (Buffer != null)
            {
                Buffer.Dispose();
                Buffer = null;
            }
        }

        public override void Dispose()
        {
            if (Disposed) return;
            base.Dispose();
            DisposeBuffer();
        }

        //To go to an Image...
        //Look for a SliceHeader in the Buffer
        //Decode Macroblocks in Slice
        //Convert Yuv to Rgb
    }

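There are also implementations of various other RFCs, which help get the media to play in a MediaElement or in other software, or just to save it to disk.

Writing to a container format is underway.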

Answer 3

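I don't know about the rest of your implementation, but it seems likely that the 'fragments' you are receiving are NAL units. Therefore, each may need the NALU start code (00 00 01 or 00 00 00 01) prepended when you reconstruct the bitstream before sending it to ffmpeg.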
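A minimal sketch of that step, using the 3-byte form of the start code as in the question; `with_start_code` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns a copy of the NAL unit prefixed with the start code 00 00 01.
// Hypothetical helper for illustration only.
std::vector<std::uint8_t> with_start_code(const std::uint8_t* nal, std::size_t len) {
    std::vector<std::uint8_t> out = {0x00, 0x00, 0x01};
    out.insert(out.end(), nal, nal + len);
    return out;
}
```

Returning a std::vector also avoids the manual new[]/delete[] pairing used in the question's code.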

In any case, you might find the RFC for H264 RTP packetization quite useful:

http://www.rfc-editor.org/rfc/rfc3984.txt

Hope this helps!

3 answers

Answer 1
25, accepted, 2010-08-17 11:38:09

Answer 2
1, 2014-11-14 19:01:42

Answer 3
1, 2010-08-16 17:21:23
