
FFMpeg library: how to precisely seek in an audio file

I am using the FFMpeg library in my Android app, and I am trying to understand how to seek to a very precise position in an audio file.

For example, I want to set the current position in my file to frame #1234567 (in a file encoded at 44100 Hz), which is equivalent to seeking to 27994.717 milliseconds.
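The arithmetic behind that figure can be checked with a small sketch. The helper names `sample_to_us` and `us_to_sample` are hypothetical (they are not FFmpeg functions), and the code assumes non-negative values small enough that the intermediate product fits in a 64-bit integer.

```cpp
#include <cstdint>

// Hypothetical helpers, not part of FFmpeg: convert between a sample index
// and a position in microseconds using integer arithmetic only.
// Assumes non-negative inputs whose products fit in int64_t.
constexpr std::int64_t kMicrosPerSecond = 1000000;

constexpr std::int64_t sample_to_us(std::int64_t sample, std::int64_t sample_rate) {
    // Round to the nearest microsecond.
    return (sample * kMicrosPerSecond + sample_rate / 2) / sample_rate;
}

constexpr std::int64_t us_to_sample(std::int64_t us, std::int64_t sample_rate) {
    // Round to the nearest sample.
    return (us * sample_rate + kMicrosPerSecond / 2) / kMicrosPerSecond;
}

// Sample 1234567 at 44100 Hz is about 27994717 us, and that value maps
// back to the same sample.
static_assert(sample_to_us(1234567, 44100) == 27994717, "sample -> us");
static_assert(us_to_sample(27994717, 44100) == 1234567, "us -> sample");
```

At 44100 Hz one sample lasts roughly 22.7 microseconds, so a microsecond position is fine-grained enough to identify a single sample.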

To achieve that, here is what I tried:

// this:
av_seek_frame(formatContext, -1, 27994717, 0);

// or this:
av_seek_frame(formatContext, -1, 27994717, AVSEEK_FLAG_ANY);

// or even this:
avformat_seek_file(formatContext, -1, 27994617, 27994717, 27994817, 0);

Using a position in microseconds gives me the best result so far.

But for some reason, the positioning is not totally accurate: when I extract the samples from the audio file, they don't start exactly at the expected position. There is a slight delay of about 30-40 milliseconds (surprisingly, even if I seek to position 0).

Am I using the function the right way, or is it even the right function?

EDIT

Here is how I get the position:

AVPacket packet;
AVStream *stream = NULL;
AVFormatContext *formatContext = NULL;
AVCodec *dec = NULL;

// initialization:
avformat_open_input(&formatContext, filename, NULL, NULL);
avformat_find_stream_info(formatContext, NULL);
int audio_stream_index = av_find_best_stream(formatContext, AVMEDIA_TYPE_AUDIO, -1, -1, &dec, 0);
stream = formatContext->streams[audio_stream_index];

...

// later, when I extract samples, here is how I get my position, in microseconds:
av_read_frame(formatContext, &packet);
long position = (long) (1000000 * (packet.pts * ((float) stream->time_base.num / stream->time_base.den)));

Thanks to that piece of code, I can get the position of the beginning of the current frame (frame = block of samples, whose size depends on the audio format: 1152 samples for mp3, 128 to 1152 for ogg, ...).

The problem is that the value I get in position is not accurate: it is approximately 30 ms late. For example, when it says 1000000, the actual position is approximately 1030000...

What did I do wrong? Is it a bug in FFMpeg?

Thanks for your help.
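A side note on the position calculation in the question: a float has a 24-bit significand, so multiplying a large pts by a float time base can lose a few microseconds. That is unlikely to explain a 30 ms offset, but integer arithmetic avoids the rounding altogether. The sketch below is only an illustration: `TimeBase` and `pts_to_us` are hypothetical names that mirror the shape of FFmpeg's AVRational, and the code assumes non-negative timestamps that do not overflow 64 bits. With FFmpeg itself, av_rescale_q(packet.pts, stream->time_base, AV_TIME_BASE_Q) performs this kind of conversion.

```cpp
#include <cstdint>

// Hypothetical stand-in for a stream time base (num/den seconds per tick).
// It only mirrors the shape of FFmpeg's AVRational for illustration.
struct TimeBase {
    std::int64_t num;
    std::int64_t den;
};

// Convert a non-negative timestamp in time-base ticks to microseconds,
// rounding to nearest. Assumes pts * num * 1000000 fits in int64_t.
constexpr std::int64_t pts_to_us(std::int64_t pts, TimeBase tb) {
    return (pts * tb.num * 1000000 + tb.den / 2) / tb.den;
}

// With a 1/44100 time base, 44100 ticks are exactly one second.
static_assert(pts_to_us(44100, TimeBase{1, 44100}) == 1000000, "one second");
// The sample from the question maps to the microsecond value used there.
static_assert(pts_to_us(1234567, TimeBase{1, 44100}) == 27994717, "target");
```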

It depends on the codec. For example, AAC has a resolution of 1024 samples per frame regardless of the sample rate, and it also has priming samples that may be discarded. MP3 (Layer III) has 1152 samples per frame for MPEG-1 and 576 for MPEG-2 and MPEG-2.5.
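To make the granularity concrete, here is a minimal sketch that locates a target sample inside a stream of fixed-size frames. `FramePosition` and `locate_sample` are hypothetical names, not FFmpeg APIs, and the sketch deliberately ignores encoder delay and priming samples, which shift real streams further.

```cpp
#include <cstdint>

// Hypothetical helper, not an FFmpeg API: locate a target sample inside a
// stream made of fixed-size codec frames. Ignores encoder delay and
// priming samples.
struct FramePosition {
    std::int64_t frame_index;  // frame that contains the target sample
    std::int64_t offset;       // samples to discard at the start of that frame
};

constexpr FramePosition locate_sample(std::int64_t target_sample,
                                      std::int64_t frame_size) {
    return FramePosition{target_sample / frame_size, target_sample % frame_size};
}

// Sample 1234567 with 1152-sample frames: frame 1071, 775 samples in.
static_assert(locate_sample(1234567, 1152).frame_index == 1071, "frame index");
static_assert(locate_sample(1234567, 1152).offset == 775, "offset in frame");
```

A 1152-sample frame at 44100 Hz lasts about 26 ms, so a demuxer that can only land on frame boundaries can be off by an amount of that order unless the leading samples of the frame are discarded after decoding.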

If you need perfection, use an uncompressed format like WAV or RIFF.

Late, but hopefully it helps someone. The idea is to save the target timestamp when seeking and then compare AVPacket->pts with this value (you can do that with AVStream->dts, but it wasn't giving good results in my experiments). If pts is still lower than the target timestamp, skip samples using the AV_PKT_DATA_SKIP_SAMPLES side data type of AVPacket->side_data.

Code for the seeking method:

void audio_decoder::seek(float seconds) {
    auto stream = m_format_ctx->streams[m_packet->stream_index];

    // convert seconds provided by the user to a timestamp in a correct base,
    // then save it for later.
    m_target_ts = av_rescale_q(seconds * AV_TIME_BASE, AV_TIME_BASE_Q, stream->time_base);

    avcodec_flush_buffers(m_codec_ctx.get());

    // Here we seek within the given stream index, using a timestamp in
    // that stream's time base. AVSEEK_FLAG_BACKWARD makes sure we land
    // at or *before* the requested timestamp.
    // av_seek_frame() returns a negative value on failure.
    int err = av_seek_frame(m_format_ctx.get(), m_packet->stream_index, m_target_ts, AVSEEK_FLAG_BACKWARD);
    if(err < 0) {
        error("audio_decoder: Error while seeking ({})", av_err_str(err));
    }
}

And the code for the decoding method:

void audio_decoder::decode() {
   <...>

   while(is_decoding) {
       // Read data as usual.
       av_read_frame(m_format_ctx.get(), m_packet.get());

       // Here is the juicy part. We were seeking, but the seek 
       // wasn't precise enough so we need to drop some frames.
       if(m_packet->pts > 0 && m_target_ts > 0 && m_packet->pts < m_target_ts) {
            auto stream = m_format_ctx->streams[m_packet->stream_index];

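            // Convert the timestamp delta (in stream time_base units) to a
            // number of samples at the decoder's sample rate.
            int64_t skip_samples = av_rescale_q(m_target_ts - m_packet->pts, stream->time_base, AVRational{1, m_codec_ctx->sample_rate});

            // Next step: attach side data to the packet,
            // which tells the decoder to drop samples.
            uint8_t *data = av_packet_get_side_data(m_packet.get(), AV_PKT_DATA_SKIP_SAMPLES, nullptr);
            if(!data) {
                 data = av_packet_new_side_data(m_packet.get(), AV_PKT_DATA_SKIP_SAMPLES, 10);
                 // Newly allocated side data may not be zeroed, so clear
                 // all 10 bytes (std::memset needs <cstring>).
                 if(data) std::memset(data, 0, 10);
            }

            // Side data layout (10 bytes): u32le samples to skip from the start,
            // u32le samples to skip from the end, u8 start skip reason,
            // u8 end skip reason. You can check it here:
            // https://ffmpeg.org/doxygen/trunk/group__lavc__packet.html#ga9a80bfcacc586b483a973272800edb97
            if(data) {
                // The field is little-endian; this direct store assumes a little-endian host.
                *reinterpret_cast<uint32_t*>(data) = static_cast<uint32_t>(skip_samples);
                data[8] = 0;
            }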
        }

        // Send packet as usual.
        avcodec_send_packet(m_codec_ctx.get(), m_packet.get());

        // Proceed to the receiving frames as usual, nothing to change there.
   }
   <...>
}
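For readers who want to see the two pieces in isolation, here is a rough, FFmpeg-free sketch of the timestamp-to-sample conversion and of the 10-byte payload documented for AV_PKT_DATA_SKIP_SAMPLES. The helper names (`ts_delta_to_samples`, `write_le32`, `make_skip_samples_payload`) are hypothetical, and the code assumes non-negative inputs that do not overflow.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helpers for illustration; FFmpeg itself is not used here.

// Convert a timestamp delta in time-base ticks (tb_num/tb_den seconds each)
// to a sample count, truncating. Assumes non-negative inputs and no overflow.
constexpr std::int64_t ts_delta_to_samples(std::int64_t delta, std::int64_t tb_num,
                                           std::int64_t tb_den, std::int64_t sample_rate) {
    return delta * tb_num * sample_rate / tb_den;
}

// Write a 32-bit value in little-endian byte order, independent of the host.
inline void write_le32(std::uint8_t *dst, std::uint32_t value) {
    dst[0] = static_cast<std::uint8_t>(value & 0xFF);
    dst[1] = static_cast<std::uint8_t>((value >> 8) & 0xFF);
    dst[2] = static_cast<std::uint8_t>((value >> 16) & 0xFF);
    dst[3] = static_cast<std::uint8_t>((value >> 24) & 0xFF);
}

// Build the 10-byte payload documented for AV_PKT_DATA_SKIP_SAMPLES:
//   bytes 0-3: u32le samples to skip from the start of the packet
//   bytes 4-7: u32le samples to skip from the end of the packet
//   byte  8:   u8 reason for the start skip
//   byte  9:   u8 reason for the end skip
inline std::array<std::uint8_t, 10> make_skip_samples_payload(std::uint32_t skip_start,
                                                              std::uint32_t skip_end) {
    std::array<std::uint8_t, 10> payload{};  // value-initialised to zero
    write_le32(payload.data(), skip_start);
    write_le32(payload.data() + 4, skip_end);
    return payload;
}
```

Writing the bytes one at a time keeps the layout correct on big-endian hosts as well, which a direct 32-bit store through a cast pointer does not guarantee.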

If it's unclear without context, you can check the same code in my project, in audio_decoder.cpp.
