
How can I pass audio frames from an input .mp4 to an output .mp4 in libavcodec?

I have a project that correctly opens a .mp4, extracts video frames, modifies them, and dumps the modified frames to an output .mp4. Everything works (mostly - I have a video timing bug that pops up at random, but I'll kill it) EXCEPT for the writing of audio. I don't want to modify the audio channel at all - I just want the audio from the input .mp4 to be passed, unaltered, to the output .mp4.

There's too much code here to provide a working example, largely because there's a lot of OpenGL and GLSL in there, but the most important part is where I advance a frame. This method is called in a loop, and if the frame was a video frame, the loop sends the image data to the rendering hardware, does a bunch of GL magic on it, then writes out a frame of video. If the frame was an audio frame, the loop does nothing, but the advance_frame() method is supposed to just dump that frame to the output mp4. I'm at a loss as to what libavcodec provides that will do this.

Note that here, I'm decoding the audio packets into frames, but that shouldn't be necessary. I'd rather work with packets and not burn the CPU time to do the decode at all. (I've tried it the other way, but this is what I wound up with when I tried to decode the data, then re-encode to create the output stream.) I just need a way to pass the packets from the input to the output.

bool MediaContainerMgr::advance_frame() {
    int ret; // Crappy naming, but I'm using ffmpeg's name for it.
    while (true) {
        ret = av_read_frame(m_format_context, m_packet);
        if (ret < 0) {
            // Do we actually need to unref the packet if it failed?
            av_packet_unref(m_packet);
            if (ret == AVERROR_EOF) {
                finalize_output();
                return false;
            }
            continue;
            //return false;
        }
        else {
            int response = decode_packet();
            if (response != 0) {
                continue;
            }
            // If this was an audio packet, the image renderer doesn't care about it - just push
            // the audio data to the output .mp4:
            if (m_packet->stream_index == m_audio_stream_index) {
                printf("m_packet->stream_index: %d\n", m_packet->stream_index);
                printf("  m_packet->pts: %lld\n", m_packet->pts);
                printf("  mpacket->size: %d\n", m_packet->size);
                // m_recording is true if we're writing a .mp4, as opposed to just letting OpenGL
                // display the frames onscreen.
                if (m_recording) {
                    int err = 0;
                    // I've tried everything I can find to try to push the audio frame to the
                    // output .mp4. This doesn't work, but neither do a half-dozen other libavcodec
                    // methods:
                    err = avcodec_send_frame(m_output_audio_codec_context, m_last_audio_frame);

                    if (err) {
                        printf("  encoding error: %d\n", err);
                    }
                }
            }
            // Check the stream type before unref'ing: av_packet_unref() resets the
            // packet's fields, so stream_index can't reliably be read afterwards.
            bool was_video = (m_packet->stream_index == m_video_stream_index);
            av_packet_unref(m_packet);
            if (was_video) {
                return true;
            }
        }
    }
}

The workhorse of advance_frame() is decode_packet(). All of this works perfectly for video data:

int MediaContainerMgr::decode_packet() {
    // Supply raw packet data as input to a decoder
    // https://ffmpeg.org/doxygen/trunk/group__lavc__decoding.html#ga58bc4bf1e0ac59e27362597e467efff3
    int             response;
    AVCodecContext* codec_context = nullptr;
    AVFrame*        frame         = nullptr;

    if (m_packet->stream_index == m_video_stream_index) {
        codec_context = m_video_input_codec_context;
        frame = m_last_video_frame;
    }
    if (m_packet->stream_index == m_audio_stream_index) {
        codec_context = m_audio_input_codec_context;
        frame = m_last_audio_frame;
    }

    if (codec_context == nullptr) {
        return -1;
    }

    response = avcodec_send_packet(codec_context, m_packet);
    if (response < 0) {
        char buf[256];
        av_strerror(response, buf, 256);
        printf("Error while receiving a frame from the decoder: %s\n", buf);
        return response;
    }

    // Return decoded output data (into a frame) from a decoder
    // https://ffmpeg.org/doxygen/trunk/group__lavc__decoding.html#ga11e6542c4e66d3028668788a1a74217c
    response = avcodec_receive_frame(codec_context, frame);
    if (response == AVERROR(EAGAIN) || response == AVERROR_EOF) {
        return response;
    } else if (response < 0) {
        char buf[256];
        av_strerror(response, buf, 256);
        printf("Error while receiving a frame from the decoder: %s\n", buf);
        return response;
    } else {
        printf(
            "Stream %d, Frame %d (type=%c, size=%d bytes), pts %lld, key_frame %d, [DTS %d]\n",
            m_packet->stream_index,
            codec_context->frame_number,
            av_get_picture_type_char(frame->pict_type),
            frame->pkt_size,
            frame->pts,
            frame->key_frame,
            frame->coded_picture_number
        );
    }
    return 0;
}

I can provide the setup for all the contexts, if necessary, but for brevity maybe we can get away with what av_dump_format(m_output_format_context, 0, filename, 1) displays (a sketch of a passthrough-friendly setup for the audio stream follows the dump):

Output #0, mp4, to 'D:\yodeling_monkey_nuggets.mp4':
  Metadata:
    encoder         : Lavf58.64.100
    Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661), yuv420p, 1920x1080, q=-1--1, 20305 kb/s, 29.97 fps, 30k tbn
    Stream #0:1: Audio: aac (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 125 kb/s
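For passthrough, the audio stream in that output context doesn't need its own encoder context at all: it's enough to create the stream and copy the codec parameters from the input stream. A minimal sketch of such a setup - the helper name and the m_output_audio_stream_index member are assumptions for illustration, not from the original code; the other names come from the code above:

bool MediaContainerMgr::add_passthrough_audio_stream() {
    // Hypothetical helper: create an output audio stream for packet passthrough.
    AVStream* in_stream  = m_format_context->streams[m_audio_stream_index];
    AVStream* out_stream = avformat_new_stream(m_output_format_context, nullptr);
    if (!out_stream) {
        return false;
    }
    // Copy codec id, sample rate, channel layout, extradata, etc. straight from
    // the input stream - the packets will never be re-encoded.
    if (avcodec_parameters_copy(out_stream->codecpar, in_stream->codecpar) < 0) {
        return false;
    }
    out_stream->codecpar->codec_tag = 0; // let the mp4 muxer choose the tag
    m_output_audio_stream_index = out_stream->index; // hypothetical member
    return true;
}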

To put an audio AVPacket into the output "as-is", without decode/encode steps, you should use the av_write_frame function for such packets instead of avcodec_send_frame (see the sketch after the notes below). Note that these functions use different contexts: AVFormatContext and AVCodecContext.

avcodec_send_frame supplies a raw video or audio frame to the encoder

av_write_frame passes the packet directly to the muxer
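A minimal sketch of that approach for the audio branch of advance_frame(), assuming the output audio stream was created by copying the input stream's codec parameters and its index stored in a hypothetical m_output_audio_stream_index member: rescale the packet's timestamps from the input stream's time base to the output stream's, retarget the packet at the output stream, and hand it to the muxer.

if (m_recording) {
    AVStream* in_stream  = m_format_context->streams[m_audio_stream_index];
    AVStream* out_stream = m_output_format_context->streams[m_output_audio_stream_index];

    // Convert pts/dts/duration from the input stream's time base to the output's.
    av_packet_rescale_ts(m_packet, in_stream->time_base, out_stream->time_base);
    m_packet->stream_index = out_stream->index;
    m_packet->pos = -1;

    // av_interleaved_write_frame() lets the muxer interleave the audio packets
    // with the encoded video packets; plain av_write_frame() also works if the
    // caller guarantees correct interleaving itself.
    int err = av_interleaved_write_frame(m_output_format_context, m_packet);
    if (err < 0) {
        char buf[256];
        av_strerror(err, buf, sizeof(buf));
        printf("  muxing error: %s\n", buf);
    }
}

Because the video path goes through an encoder while the audio path does not, av_interleaved_write_frame is the safer choice here: the muxer buffers and reorders packets so the audio and video streams stay correctly interleaved.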
