简体   繁体   中英

RTMP Broadcast packet body structure for Twitch

I'm currently working on a project similar to OBS, where I'm capturing screen data, encoding it with the x264 library, and then broadcasting it to a twitch server.

Currently, the servers are accepting the data, but no video is being played - it buffers for a moment, then returns an error code "2000: network error"

Like OBS Classic, I'm dividing each NAL provided by x264 by its type, and then making changes to each

int frame_size = x264_encoder_encode(encoder, &nals, &num_nals, &pic_in, &pic_out);

    //sort the NAL's into their types and make necessary adjustments

    int timeOffset = int(pic_out.i_pts - pic_out.i_dts);

    timeOffset = htonl(timeOffset);//host to network translation, ensure the bytes are in the right format
    BYTE *timeOffsetAddr = ((BYTE*)&timeOffset) + 1;

    videoSection sect;
    bool foundFrame = false;

    uint8_t * spsPayload = NULL;
    int spsSize = 0;

    for (int i = 0; i<num_nals; i++) {
        //std::cout << "VideoEncoder: EncodedImages Size: " << encodedImages->size() << std::endl;
        x264_nal_t &nal = nals[i];
        //std::cout << "NAL is:" << nal.i_type << std::endl;

        //need to account for pps/sps, seems to always be the first frame sent
        if (nal.i_type == NAL_SPS) {
            spsSize = nal.i_payload;
            spsPayload = (uint8_t*)malloc(spsSize);
            memcpy(spsPayload, nal.p_payload, spsSize);
        } else if (nal.i_type == NAL_PPS){
            //pps always happens after sps
            if (spsPayload == NULL) {
                std::cout << "VideoEncoder: critical error, sps not set" << std::endl;
            }
            uint8_t * payload = (uint8_t*)malloc(nal.i_payload + spsSize);
            memcpy(payload, spsPayload, spsSize);
            memcpy(payload, nal.p_payload + spsSize, nal.i_payload);
            sect = { nal.i_payload + spsSize, payload, nal.i_type };
            encodedImages->push(sect);
        } else if (nal.i_type == NAL_SEI || nal.i_type == NAL_FILLER) { 
            //these need some bytes at the start removed
            BYTE *skip = nal.p_payload;
            while (*(skip++) != 0x1);
            int skipBytes = (int)(skip - nal.p_payload);

            int newPayloadSize = (nal.i_payload - skipBytes);

            uint8_t * payload = (uint8_t*)malloc(newPayloadSize);
            memcpy(payload, nal.p_payload + skipBytes, newPayloadSize);
            sect = { newPayloadSize, payload, nal.i_type };
            encodedImages->push(sect);

        } else if (nal.i_type == NAL_SLICE_IDR || nal.i_type == NAL_SLICE) { 
            //these packets need an additional section at the start
            BYTE *skip = nal.p_payload;
            while (*(skip++) != 0x1);
            int skipBytes = (int)(skip - nal.p_payload);

            std::vector<BYTE> bodyData;
            if (!foundFrame) {
                if (nal.i_type == NAL_SLICE_IDR) { bodyData.push_back(0x17); } else { bodyData.push_back(0x27); } //add a 17 or a 27 as appropriate
                bodyData.push_back(1);
                bodyData.push_back(*timeOffsetAddr);

                foundFrame = true;
            }

            //put into the payload the bodyData followed by the nal payload
            uint8_t * bodyDataPayload = (uint8_t*)malloc(bodyData.size());
            memcpy(bodyDataPayload, bodyData.data(), bodyData.size() * sizeof(BYTE));

            int newPayloadSize = (nal.i_payload - skipBytes);

            uint8_t * payload = (uint8_t*)malloc(newPayloadSize + sizeof(bodyDataPayload));
            memcpy(payload, bodyDataPayload, sizeof(bodyDataPayload));
            memcpy(payload + sizeof(bodyDataPayload), nal.p_payload + skipBytes, newPayloadSize);
            int totalSize = newPayloadSize + sizeof(bodyDataPayload);
            sect = { totalSize, payload, nal.i_type };
            encodedImages->push(sect);
        } else {
            std::cout << "VideoEncoder: Nal type did not match expected" << std::endl;
            continue;
        }
    }

The NAL payload data is then put into a struct, VideoSection, in a queue buffer

//used to transfer encoded data
struct videoSection {
    int frameSize;
    uint8_t* payload;
    int type;
};

After which it is picked up by the broadcaster, a few more changes are made, and then I call rtmp_send()

videoSection sect = encodedImages->front();
encodedImages->pop();

//std::cout << "Broadcaster: Frame Size: " << sect.frameSize << std::endl;

//two methods of sending RTMP data, _sendpacket and _write. Using sendpacket for greater control

RTMPPacket * packet;

unsigned char* buf = (unsigned char*)sect.payload;

int type = buf[0]&0x1f; //I believe &0x1f sets a 32bit limit
int len = sect.frameSize;
long timeOffset = GetTickCount() - rtmp_start_time;

//assign space packet will need
packet = (RTMPPacket *)malloc(sizeof(RTMPPacket)+RTMP_MAX_HEADER_SIZE + len + 9);
memset(packet, 0, sizeof(RTMPPacket) + RTMP_MAX_HEADER_SIZE);

packet->m_body = (char *)packet + sizeof(RTMPPacket) + RTMP_MAX_HEADER_SIZE;
packet->m_nBodySize = len + 9;

//std::cout << "Broadcaster: Packet Size: " << sizeof(RTMPPacket) + RTMP_MAX_HEADER_SIZE + len + 9 << std::endl;
//std::cout << "Broadcaster: Packet Body Size: " << len + 9 << std::endl;

//set body to point to the packetbody
unsigned char *body = (unsigned char *)packet->m_body;
memset(body, 0, len + 9);



//NAL_SLICE_IDR represents keyframe
//first element determines packet type
body[0] = 0x27;//inter-frame h.264
if (sect.type == NAL_SLICE_IDR) {
    body[0] = 0x17; //h.264 codec id
}


//-------------------------------------------------------------------------------
//this section taken from https://stackoverflow.com/questions/25031759/using-x264-and-librtmp-to-send-live-camera-frame-but-the-flash-cant-show
//in an effort to understand packet format. it does not resolve my previous issues formatting the data for twitch to play it

//sets body to be NAL unit
body[1] = 0x01;
body[2] = 0x00;
body[3] = 0x00;
body[4] = 0x00;

//>> is a shift right
//shift len to the right, and AND it
/*body[5] = (len >> 24) & 0xff;
body[6] = (len >> 16) & 0xff;
body[7] = (len >> 8) & 0xff;
body[8] = (len) & 0xff;*/

//end code sourced from https://stackoverflow.com/questions/25031759/using-x264-and-librtmp-to-send-live-camera-frame-but-the-flash-cant-show
//-------------------------------------------------------------------------------

//copy from buffer into rest of body
memcpy(&body[9], buf, len);

//DEBUG

//save individual packet body to a file with name rtmp[packetnum]
//determine why some packets do not have 0x27 or 0x17 at the start
//still happening, makes no sense given the above code

/*std::string fileLocation = "rtmp" + std::to_string(packCount++);
std::cout << fileLocation << std::endl;
const char * charConversion = fileLocation.c_str();

FILE* saveFile = NULL;
saveFile = fopen(charConversion, "w+b");//open as write and binary
if (!fwrite(body, len + 9, 1, saveFile)) {
    std::cout << "VideoEncoder: Error while trying to write to file" << std::endl;
}
fclose(saveFile);*/

//END DEBUG

//other packet details
packet->m_hasAbsTimestamp = 0;
packet->m_packetType = RTMP_PACKET_TYPE_VIDEO;
if (rtmp != NULL) {
    packet->m_nInfoField2 = rtmp->m_stream_id;
}
packet->m_nChannel = 0x04;
packet->m_headerType = RTMP_PACKET_SIZE_LARGE;
packet->m_nTimeStamp = timeOffset;

//send the packet
if (rtmp != NULL) {
    RTMP_SendPacket(rtmp, packet, TRUE);
}

I can see that Twitch is receiving the data in the inspector, at a steady 3kbps. so I'm sure something is wrong with how I'm adjusting the data before sending it. Can anyone advise me on what I'm doing wrong here?

The problems start before the code you included even. When you configure x264 be sure to set:

b_aud = 0;
b_repeat_headers = 0;
b_annexb = 0;

This will tell x264 to generate the format needed by rtmp, Then you can skip all the per-nal preprocessing.

For sps/pps use x264_encoder_headers to retrieve them after x264_encoder_open . Encode them into an "extradata" buffer as documented here Possible Locations for Sequence/Picture Parameter Set(s) for H.264 Stream . This extradata goes into an rtmp "sequence header" packet before any frames are sent. Set the frame the AVCPacketType accordingly body[1] in your case, 0 for sequence header 1 for everything else,

body[0] = 0x27;
body[1] = 0;
body[2] = 0;
body[3] = 0;
body[4] = 0;
memcpy(&body[5], extradata, extradata_size);

body[2] through body[4] MUST be set to the frame cts ( pts - dts ) if you have b frames. If you want to set it to zero, configure x264 for baseline profile, but this will result in reduced image quality. Use the return code from x264_encoder_encode as the frame size, and write the whole frame in one go.

int frame_size = x264_encoder_encode(encoder, &nals, &num_nals, &pic_in, &pic_out);
if(frame_size) {
    int cts = pic_out->i_pts - pic_out->i_dts;
    body[0] = pic_out->b_keyframe ? 0x27 : 0x17;
    body[1] = 1;
    body[2] = cts>>16;
    body[3] = cts>>8;
    body[4] = cts;
    memcpy(&body[5], nals->p_payload, frame_size);
}

Finally, Twitch requires you also send an AAC audio stream. and be sure to set the keyframe interval to 2 seconds.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM