
How to put audio data in an AVFrame for encoding

I am trying to encode raw PCM sound to G711A and G711U and then decode it. With these codecs everything works fine, because I can choose any value for the AVCodecContext frame_size when encoding. With the Opus codec, however, the AVCodecContext frame_size is equal to 120. If I understand correctly, when my input data array is larger than 120 I need to do some kind of buffering: split the input data into several parts, put them into AVFrame->data one after another, and pass the AVFrame to the encoder each time.

As a result I get very bad sound, and not only with the Opus codec but also with G711 if I set its AVCodecContext frame_size to a value smaller than the size of my input data.

So my question is: what is the correct way to encode input data if its size is bigger than the AVCodecContext frame_size? Do I need to split my input data into parts that are <= AVCodecContext frame_size, and if so, how should I do that?

At this moment my code looks like this:

void encode(uint8_t *data, unsigned int length)
{
    int rawOffset = 0;
    int rawDelta = 0;
    int rawSamplesCount = frameEncode->nb_samples <= length ? frameEncode->nb_samples : length;

    while (rawSamplesCount > 0)
    {
        memcpy(frameEncode->data[0], &data[rawOffset], sizeof(uint8_t) * rawSamplesCount);

        encodeFrame();

        rawOffset += rawSamplesCount;
        rawDelta = length - rawOffset;
        rawSamplesCount = rawDelta > frameEncode->nb_samples ? frameEncode->nb_samples : rawDelta;
    }

    av_frame_unref(frameEncode);
}

void encodeFrame()
{
    /* send the frame for encoding */
    int ret = avcodec_send_frame(contextEncoder, frameEncode);
    if (ret < 0)
    {
        LOGE(TAG, "[encodeFrame] avcodec_send_frame error: %s", av_err2str(ret));
        return;
    }

    /* read all the available output packets (in general there may be any number of them) */
    while (ret >= 0)
    {
        ret = avcodec_receive_packet(contextEncoder, packetEncode);
        if (ret < 0 && ret != AVERROR(EAGAIN)) LOGE(TAG, "[encodeFrame] error in avcodec_receive_packet: %s", av_err2str(ret));
        if (ret < 0) break;
        std::pair<uint8_t*, unsigned int> p = std::pair<uint8_t*, unsigned int>();
        p.first = (uint8_t *)(malloc(sizeof(uint8_t) * packetEncode->size));
        memcpy(p.first, packetEncode->data, (size_t)packetEncode->size);
        p.second = (unsigned int)(packetEncode->size);

        listEncode.push_back(p); // place encoded data into list to finally create one array of encoded data from it
    }
    av_packet_unref(packetEncode);
}

As you can see, I split my input data into several parts, put each part into frame->data, and then pass the frame to the encoder, but I'm not sure that this is the correct way.

UPD: I noticed that when I use G711 with the AVCodecContext frame_size set to 160 and the size of my input data is 160 or 320, everything works fine, but if the input data size is 640 then I get a bad buzzing sound.

You said it all yourself: if your input data is bigger than the encoder's frame_size, you need to buffer it, split it into several parts, put each part into AVFrame->data in turn, and pass the AVFrame to the encoder.

This is what you need. Buffer the samples and send a fixed number of them to the encoder each time.
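
The buffering described above could look roughly like the sketch below. It is only an illustration, not the asker's code and not an FFmpeg API: `FrameChunker` and its callback are hypothetical names, and it assumes mono, 16-bit PCM, so that one frame is exactly `frame_size` samples (`frame_size * sizeof(int16_t)` bytes). Samples that do not fill a whole frame stay in the buffer until the next call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper (not part of FFmpeg): collects 16-bit mono PCM samples
// and hands them out in blocks of exactly frame_size samples.
class FrameChunker {
public:
    using FrameCallback =
        std::function<void(const int16_t *samples, std::size_t count)>;

    FrameChunker(std::size_t frame_size, FrameCallback on_frame)
        : frame_size_(frame_size), on_frame_(std::move(on_frame)) {
        assert(frame_size_ > 0);
    }

    // Append `count` samples (samples, not bytes) and emit every complete frame.
    void push(const int16_t *samples, std::size_t count) {
        pending_.insert(pending_.end(), samples, samples + count);

        std::size_t offset = 0;
        while (pending_.size() - offset >= frame_size_) {
            on_frame_(pending_.data() + offset, frame_size_);
            offset += frame_size_;
        }

        // Keep the incomplete tail so it can be completed by the next push().
        pending_.erase(pending_.begin(),
                       pending_.begin() + static_cast<std::ptrdiff_t>(offset));
    }

    // Number of samples still waiting for a full frame.
    std::size_t pending() const { return pending_.size(); }

private:
    std::size_t frame_size_;
    FrameCallback on_frame_;
    std::vector<int16_t> pending_;
};
```

In the code from the question, the callback would be the place to copy `count * sizeof(int16_t)` bytes into `frameEncode->data[0]` and call `encodeFrame()`. This assumes the encoder's sample format really is `AV_SAMPLE_FMT_S16`, since `nb_samples` counts samples rather than bytes. FFmpeg also provides `AVAudioFifo` in `libavutil/audio_fifo.h`, which may be a better fit than a hand-written buffer.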
