
FFmpeg: audio synchronization with AudioQueue

I have a video player in my application. I have no problem with AVI files and MP3 audio, but when I play MPG or WMV I have to use avcodec_decode_audio3. The first seconds play, and then every time the buffer refills I get a few seconds of silence before the audio continues from the same place.

This is the AudioQueue Format:

        playState.format.mSampleRate = _av->audio.sample_rate;
        playState.format.mFormatID = kAudioFormatLinearPCM;
        playState.format.mFormatFlags =  kAudioFormatFlagsCanonical;
        playState.format.mChannelsPerFrame = _av->audio.channels_per_frame;
        playState.format.mBytesPerPacket = sizeof(AudioSampleType) *_av->audio.channels_per_frame;
        playState.format.mBytesPerFrame = sizeof(AudioSampleType) *_av->audio.channels_per_frame;
        playState.format.mBitsPerChannel = 8 * sizeof(AudioSampleType);

        playState.format.mFramesPerPacket = 1;
        playState.format.mReserved = 0;

fillAudioBuffer:

static void fillAudioBuffer(AudioQueueRef queue, AudioQueueBufferRef buffer){

    int lengthCopied = INT32_MAX;
    int dts = 0;
    int isDone = 0;

    buffer->mAudioDataByteSize = 0;
    buffer->mPacketDescriptionCount = 0;

    OSStatus err = 0;
    AudioTimeStamp bufferStartTime;

    AudioQueueGetCurrentTime(queue, NULL, &bufferStartTime, NULL);

    while(buffer->mPacketDescriptionCount < numPacketsToRead && lengthCopied > 0){

        lengthCopied = getNextAudio(_av, buffer->mAudioDataBytesCapacity - buffer->mAudioDataByteSize,
                                    (uint8_t*)buffer->mAudioData + buffer->mAudioDataByteSize, &dts, &isDone);
        if(!lengthCopied || isDone) break;

        if(aqStartDts < 0) aqStartDts = dts;
        if(buffer->mPacketDescriptionCount == 0){
            bufferStartTime.mFlags = kAudioTimeStampSampleTimeValid;
            bufferStartTime.mSampleTime = (Float64)(dts - aqStartDts);
        }
        buffer->mPacketDescriptions[buffer->mPacketDescriptionCount].mStartOffset = buffer->mAudioDataByteSize;
        buffer->mPacketDescriptions[buffer->mPacketDescriptionCount].mDataByteSize = lengthCopied;
        buffer->mPacketDescriptions[buffer->mPacketDescriptionCount].mVariableFramesInPacket = _av->audio.frame_size;
        buffer->mPacketDescriptionCount++;
        buffer->mAudioDataByteSize += lengthCopied;
    }

    if(buffer->mAudioDataByteSize){
        if((err = AudioQueueEnqueueBufferWithParameters(queue, buffer, 0, NULL, 0, 0, 0, NULL, &bufferStartTime, NULL))){
            // enqueue failed; the error is currently ignored
        }
    }
}
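
For context, this is roughly how a callback like the one above gets wired up. It is only a sketch under assumptions: numPacketsToRead, the buffer count and the ~1 MB buffer size are guesses based on the snippets and the log below, aqOutputCallback is just an adapter added here so the signature matches what AudioQueueNewOutput expects, and playState.format is the AudioStreamBasicDescription shown earlier:

#include <AudioToolbox/AudioToolbox.h>

#define kNumBuffers      3
#define kBufferByteSize  (1024 * 1024)    // assumption: roughly the ~1 MB buffers seen in the log

static UInt32 numPacketsToRead = 512;     // assumption: max packet descriptions per buffer

// Adapter with the signature AudioQueueNewOutput expects; it just forwards to fillAudioBuffer.
static void aqOutputCallback(void *userData, AudioQueueRef queue, AudioQueueBufferRef buffer) {
    fillAudioBuffer(queue, buffer);
}

static void startAudioQueue(void) {
    AudioQueueRef queue = NULL;

    // playState.format is the AudioStreamBasicDescription filled in above.
    AudioQueueNewOutput(&playState.format, aqOutputCallback, NULL, NULL, NULL, 0, &queue);

    for (int i = 0; i < kNumBuffers; i++) {
        AudioQueueBufferRef buffer = NULL;
        // Allocate buffers that carry packet descriptions, since fillAudioBuffer fills them in.
        AudioQueueAllocateBufferWithPacketDescriptions(queue, kBufferByteSize,
                                                       numPacketsToRead, &buffer);
        fillAudioBuffer(queue, buffer);   // prime the queue before starting playback
    }
    AudioQueueStart(queue, NULL);
}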


getNextAudio:

int getNextAudio(video_data_t* vInst, int maxlength, uint8_t* buf, int* pts, int* isDone) {
    struct video_context_t *ctx = vInst->context;
    int datalength = 0;

    while(ctx->audio_ring.lock ||
          ((ctx->audio_ring.count <= 0 && ((ctx->play_state & STATE_DIE) != STATE_DIE)) &&
           ((ctx->play_state & STATE_EOF) != STATE_EOF))){
        PMSG1(stdout, "die get audio %d", ctx->play_state);
        if((ctx->play_state & STATE_STOP) != STATE_STOP){
            PMSG1(stdout, "die NO CARGADO %d", ctx->play_state);
            return 0;
        }
        usleep(100);
    }
    *pts = 0;
    ctx->audio_ring.lock = kLocked;

    if(ctx->audio_ring.count > 0 && maxlength > ctx->audio_buffer[ctx->audio_ring.read].size){
        memcpy(buf, ctx->audio_buffer[ctx->audio_ring.read].data, ctx->audio_buffer[ctx->audio_ring.read].size);
        datalength = ctx->audio_buffer[ctx->audio_ring.read].size;
        *pts = ctx->audio_buffer[ctx->audio_ring.read].pts;
        ctx->audio_ring.read++;
        ctx->audio_ring.read %= ABUF_SIZE;
        ctx->audio_ring.count--;
    }
    ctx->audio_ring.lock = kUnlocked;

    if((ctx->play_state & STATE_EOF) == STATE_EOF && ctx->audio_ring.count == 0) *isDone = 1;
    return datalength;
}
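
Since the question mentions avcodec_decode_audio3, here is a hedged sketch of what the producer side feeding that ring might look like. Everything not shown in the snippets above (the write index, the slot-capacity check, decodeAudioPacket itself) is an assumption made only for illustration; only the field names taken from getNextAudio are from the original:

#include <string.h>
#include <unistd.h>
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>

// Sketch of the decode thread: pull one AVPacket's worth of audio through
// avcodec_decode_audio3 and push the decoded PCM into the ring read by getNextAudio.
static void decodeAudioPacket(struct video_context_t *ctx, AVCodecContext *acodec, AVPacket *pkt) {
    static int16_t samples[AVCODEC_MAX_AUDIO_FRAME_SIZE / 2];

    while (pkt->size > 0) {
        int sampleBytes = sizeof(samples);                 // in: buffer capacity, out: decoded bytes
        int consumed = avcodec_decode_audio3(acodec, samples, &sampleBytes, pkt);
        if (consumed < 0) break;                           // decode error: drop the rest of the packet
        pkt->data += consumed;
        pkt->size -= consumed;
        if (sampleBytes <= 0) continue;                    // no output from this call

        // Wait for a free slot; same busy-wait/lock style as the reader above.
        while (ctx->audio_ring.lock || ctx->audio_ring.count >= ABUF_SIZE)
            usleep(100);

        ctx->audio_ring.lock = kLocked;
        int w = ctx->audio_ring.write;                     // assumption: write index paired with .read
        memcpy(ctx->audio_buffer[w].data, samples, sampleBytes);  // assumes .data is preallocated and large enough
        ctx->audio_buffer[w].size = sampleBytes;
        ctx->audio_buffer[w].pts  = pkt->dts;              // the DTS that getNextAudio later hands back
        ctx->audio_ring.write = (w + 1) % ABUF_SIZE;
        ctx->audio_ring.count++;
        ctx->audio_ring.lock = kUnlocked;
    }
}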

This is a log from playing an MPG file:

Input #0, mpeg, '1.MPG':
  Duration: 00:03:14.74, start: 3370.475789, bitrate: 2489 kb/s
    Stream #0:0[0x1e0]: Video: mpeg2video (Main), yuv420p, 544x576 [SAR 24:17 DAR 4:3], 9000 kb/s, 25 fps, 25 tbr, 90k tbn, 50 tbc
    Stream #0:1[0x1c0]: Audio: mp2, 48000 Hz, stereo, s16, 192 kb/s
mpeg2video  MPEG-2 video 
aspect 1.333333
startPlayback
DTS: 0.000000 time base: 0.000011 StartDTS: 303347520 Orig DTS: 303347520
Video Buffer: 157/1024 Audio Buffer: 33/1024
Bytes copied for buffer 0xc292ac0: 1046016
DTS: 490320.000000 time base: 0.000011 StartDTS: 303347520 Orig DTS: 303837840
Video Buffer: 276/1024 Audio Buffer: 2/1024
Bytes copied for buffer 0x1225f8b0: 1046016
DTS: 980640.000000 time base: 0.000011 StartDTS: 303347520 Orig DTS: 304328160
Video Buffer: 411/1024 Audio Buffer: 1/1024
Bytes copied for buffer 0x13380840: 1046016
DTS: 1470960.000000 time base: 0.000011 StartDTS: 303347520 Orig DTS: 304818480
Video Buffer: 885/1024 Audio Buffer: 797/1024
Bytes copied for buffer 0xc292ac0: 1046016
-----Here the audio stops for 4 or 5 seconds
-----then continues for 4 or 5 seconds
DTS: 1961280.000000 time base: 0.000011 StartDTS: 303347520 Orig DTS: 305308800
Video Buffer: 765/1024 Audio Buffer: 797/1024
Bytes copied for buffer 0x1225f8b0: 1046016
-----Here the audio stops for 4 or 5 seconds
-----then continues for 4 or 5 seconds
DTS: 2451600.000000 time base: 0.000011 StartDTS: 303347520 Orig DTS: 305799120
Video Buffer: 644/1024 Audio Buffer: 798/1024
Bytes copied for buffer 0x13380840: 1046016
...

If I reduce the buffer size, both the silence and the playing time get shorter. How can I fix this? Thanks!!

I've got a similar problem, though not quite the same. I'm not even sure if iOS looks at mSampleTime (I need to experiment with that), but one thing that jumps out at me is that mSampleTime is NOT in units of time but in units of samples. Instead of

 if(buffer->mPacketDescriptionCount == 0){
     bufferStartTime.mFlags = kAudioTimeStampSampleTimeValid;
     bufferStartTime.mSampleTime = (Float64)(dts - aqStartDts);
 }

To set the bufferStartTime, I'm using the following:

AudioTimeStamp presentationTime = { 0 };
AudioTimeStamp actualStartTime = { 0 };

presentationTime.mFlags = kAudioTimeStampSampleTimeValid;
presentationTime.mSampleTime = (Float64) (block->presentationTime / (Float64) 1000000000.0) * (Float64) format.mSampleRate;

AudioQueueEnqueueBufferWithParameters(output, buffer, 0, NULL, trimFramesStart, trimFramesEnd, 0, NULL, &presentationTime, &actualStartTime);

Of course the above code snippets reference data structures I'm not defining here (I'm not defining my block object, and its presentationTime is in nanoseconds). Again, I myself seem to have a time drift when reading back the time from the audio queue via AudioQueueGetCurrentTime: synchronizing the video to that effective position in the audio, there seems to be a tiny constant offset that compounds A/V desynchronization over time for me. I have not played with other audio formats yet.
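
Applied to the code in the question, that means converting the DTS delta from the audio stream's time_base units into sample frames at the queue's sample rate before writing it into mSampleTime. A minimal sketch, assuming audioStream is the AVStream* the packets come from (that variable is not part of the original code):

#include <libavformat/avformat.h>
#include <AudioToolbox/AudioToolbox.h>

// Convert a DTS delta expressed in the stream's time_base into an
// AudioQueue sample time expressed in sample frames at the output rate.
static Float64 dtsToSampleTime(int64_t dts, int64_t startDts, AVStream *audioStream, Float64 sampleRate) {
    Float64 seconds = (Float64)(dts - startDts) * av_q2d(audioStream->time_base);
    return seconds * sampleRate;
}

// In fillAudioBuffer, instead of using the raw DTS delta:
if (buffer->mPacketDescriptionCount == 0) {
    bufferStartTime.mFlags = kAudioTimeStampSampleTimeValid;
    bufferStartTime.mSampleTime = dtsToSampleTime(dts, aqStartDts, audioStream,
                                                  playState.format.mSampleRate);
}

As a sanity check against the log above: with the 90 kHz MPEG time base and 48 kHz audio, a DTS delta of 490320 ticks is 490320 / 90000 × 48000 = 261504 sample frames, which matches the 1046016 bytes copied per buffer at 4 bytes per frame.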
