
FFMPEG API - Recording video and audio - Syncing problems

I'm developing an app which is able to record video from a webcam and audio from a microphone. I've been using Qt, but unfortunately the camera module does not work on Windows, which led me to use ffmpeg to record the video/audio.

My camera module is now working well, apart from a slight problem with syncing. The audio and video sometimes end up out of sync by a small amount (less than a second, I'd say, although it might be worse with longer recordings).

When I encode the frames I add the PTS in the following way (which I took from the muxing.c example; a sketch follows the list):

  • For the video frames I increment the PTS one by one (starting at 0).
  • For the audio frames I increment the PTS by the nb_samples of the audio frame (starting at 0).
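
In code, the scheme above looks roughly like this (a minimal sketch; next_video_pts and next_audio_pts are illustrative counter names, not my actual code):

// Video: one tick of the video codec time base per frame.
outVideoFrame->pts = next_video_pts++;

// Audio: the PTS is counted in samples, so advance it by the
// number of samples carried by each encoded frame.
outAudioFrame->pts = next_audio_pts;
next_audio_pts += outAudioFrame->nb_samples;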

I am saving the file at 25 fps and asking the camera to give me 25 fps (which it can). I am also converting the video frames to the YUV420P format. For the audio frame conversion I need to use an AVAudioFifo, because the microphone sends bigger samples than the mp4 stream supports, so I have to split them into chunks. I used the transcode.c example for this.
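
For reference, a rough sketch of the FIFO splitting step (error handling omitted; fifo, convertedSamples, nbInputSamples and audioCodecCtx are assumed names, not my exact code):

#include <libavutil/audio_fifo.h>

// Buffer the converted microphone samples...
av_audio_fifo_write(fifo, (void **)convertedSamples, nbInputSamples);

// ...and drain them in chunks of exactly frame_size samples,
// which is what the mp4/AAC encoder expects per frame.
while (av_audio_fifo_size(fifo) >= audioCodecCtx->frame_size) {
    AVFrame *chunk = av_frame_alloc();
    chunk->nb_samples     = audioCodecCtx->frame_size;
    chunk->format         = audioCodecCtx->sample_fmt;
    chunk->channel_layout = audioCodecCtx->channel_layout;
    av_frame_get_buffer(chunk, 0);
    av_audio_fifo_read(fifo, (void **)chunk->data, chunk->nb_samples);
    // set chunk->pts, send it to the encoder, then av_frame_free(&chunk)
}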

I am out of ideas as to what I should do to sync the audio and video. Do I need to use a clock or something to correctly sync up both streams?

The full code is too big to post here, but should it be necessary I can add it to GitHub, for example.

Here is the code for writing a frame:

int FFCapture::writeFrame(const AVRational *time_base, AVStream *stream, AVPacket *pkt) {
    /* rescale output packet timestamp values from codec to stream timebase */
    av_packet_rescale_ts(pkt, *time_base, stream->time_base);
    pkt->stream_index = stream->index;
    /* Write the compressed frame to the media file. */
    return av_interleaved_write_frame(oFormatContext, pkt);
}

Code for getting the elapsed time:

qint64 FFCapture::getElapsedTime(qint64 *previousTime) {
    qint64 newTime = timer.elapsed();
    if(newTime > *previousTime) {
        *previousTime = newTime;
        return newTime;
    }
    /* the clock has not advanced past the previous frame;
       return -1 so the caller can skip this one */
    return -1;
}

Code for adding the PTS (video and audio stream, respectively):

// Video stream
qint64 time = getElapsedTime(&previousVideoTime);
if(time >= 0) outFrame->pts = time;
//if(time >= 0) outFrame->pts = av_rescale_q(time, outStream.videoStream->codec->time_base, outStream.videoStream->time_base);

// Audio stream
qint64 time = getElapsedTime(&previousAudioTime);
if(time >= 0) {
    AVRational aux;
    aux.num = 1;
    aux.den = 1000;
    outFrame->pts = time;
    //outFrame->pts = av_rescale_q(time, aux, outStream.audioStream->time_base);
}

Sounds like you need to give the frames (audio and video) real timestamps. Create a function that returns the elapsed time since you started the capture, in milliseconds (as an integer). Then set the time_base of each stream to {1,1000} and set the pts of each frame to the return value of your function. But be careful: you can't have a timestamp that is <= a previous timestamp, so you will need to drop frames if you get several all at once (or write another mechanism for dealing with this situation).
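
A minimal sketch of that approach, assuming a QElapsedTimer called timer that was started when the capture began (all names are illustrative):

// Both streams share a millisecond time base...
videoStream->time_base = AVRational{1, 1000};
audioStream->time_base = AVRational{1, 1000};

// ...and every frame gets its PTS from the same wall clock.
qint64 now = timer.elapsed();      // ms since capture start
if (now <= lastVideoPts) {
    // Timestamp would not increase: drop this frame
    // (or buffer it and re-stamp it later).
} else {
    outFrame->pts = now;
    lastVideoPts = now;
}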

Taken from my longer answer here.

Example using QElapsedTimer here.
