
Seamless audio recording while flipping camera, using AVCaptureSession & AVAssetWriter

I'm looking for a way to maintain a seamless audio track while flipping between the front and back camera. Many apps on the market can do this; one example is Snapchat.

Solutions should use AVCaptureSession and AVAssetWriter. They should explicitly not use AVMutableComposition, since there is currently a bug between AVMutableComposition and AVCaptureSession. Also, I can't afford any post-processing time.

Currently, when I change the video input, the audio recording skips and becomes out of sync.

I'm including the code that could be relevant.

Flip Camera

-(void) updateCameraDirection:(CamDirection)vCameraDirection {
    if(session) {
        AVCaptureDeviceInput* currentInput;
        AVCaptureDeviceInput* newInput;
        BOOL videoMirrored = NO;
        switch (vCameraDirection) {
            case CamDirection_Front:
                currentInput = input_Back;
                newInput = input_Front;
                videoMirrored = NO;
                break;
            case CamDirection_Back:
                currentInput = input_Front;
                newInput = input_Back;
                videoMirrored = YES;
                break;
            default:
                break;
        }

        [session beginConfiguration];
        //disconnect old input
        [session removeInput:currentInput];
        //connect new input
        [session addInput:newInput];
        //get new data connection and config
        dataOutputVideoConnection = [dataOutputVideo connectionWithMediaType:AVMediaTypeVideo];
        dataOutputVideoConnection.videoOrientation = AVCaptureVideoOrientationPortrait;
        dataOutputVideoConnection.videoMirrored = videoMirrored;
        //finish
        [session commitConfiguration];
    }
}

Sample Buffer

- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection {
    //not active
    if(!recordingVideo)
        return;

    //start session if not started
    if(!startedSession) {
        startedSession = YES;
        [assetWriter startSessionAtSourceTime:CMSampleBufferGetPresentationTimeStamp(sampleBuffer)];
    }

    //Process sample buffers
    if (connection == dataOutputAudioConnection) {
        if([assetWriterInputAudio isReadyForMoreMediaData]) {
            BOOL success = [assetWriterInputAudio appendSampleBuffer:sampleBuffer];
            //…
        }

    } else if (connection == dataOutputVideoConnection) {
        if([assetWriterInputVideo isReadyForMoreMediaData]) {        
            BOOL success = [assetWriterInputVideo appendSampleBuffer:sampleBuffer];
            //…
        }
    }
}

Perhaps adjust the audio sample timestamp?
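
For illustration, the sketch below (in Swift, like the answers, and not code from the question) shows one way a buffer's timestamps could be adjusted with CMSampleBufferCreateCopyWithNewTiming before appending; retimedCopy and offset are hypothetical names, with offset standing for the gap the camera switch introduces:

import CoreMedia

// Hypothetical helper: returns a copy of `sampleBuffer` with every timing
// entry shifted back by `offset` (the accumulated gap from the switch).
func retimedCopy(of sampleBuffer: CMSampleBuffer, shiftedBackBy offset: CMTime) -> CMSampleBuffer? {
    // Ask how many timing entries the buffer has, then fetch them all.
    var count: CMItemCount = 0
    CMSampleBufferGetSampleTimingInfoArray(sampleBuffer, entryCount: 0, arrayToFill: nil, entriesNeededOut: &count)
    var timingInfo = [CMSampleTimingInfo](repeating: CMSampleTimingInfo(), count: count)
    CMSampleBufferGetSampleTimingInfoArray(sampleBuffer, entryCount: count, arrayToFill: &timingInfo, entriesNeededOut: &count)

    // Subtract the gap from every presentation (and decode) timestamp.
    for i in 0..<count {
        timingInfo[i].presentationTimeStamp = CMTimeSubtract(timingInfo[i].presentationTimeStamp, offset)
        if timingInfo[i].decodeTimeStamp.isValid {
            timingInfo[i].decodeTimeStamp = CMTimeSubtract(timingInfo[i].decodeTimeStamp, offset)
        }
    }

    var adjusted: CMSampleBuffer?
    CMSampleBufferCreateCopyWithNewTiming(allocator: kCFAllocatorDefault,
                                          sampleBuffer: sampleBuffer,
                                          sampleTimingEntryCount: count,
                                          sampleTimingArray: &timingInfo,
                                          sampleBufferOut: &adjusted)
    return adjusted
}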

Hey, I was facing the same issue and discovered that after switching cameras the next frame was pushed far out of place. This seemed to shift every frame after that, causing the video and audio to go out of sync. My solution was to shift every misplaced frame to its correct position after switching cameras.

Sorry, my answer will be in Swift 4.2.

You'll have to use AVAssetWriterInputPixelBufferAdaptor in order to append the sample buffers at a specific presentation timestamp.

previousPresentationTimeStamp is the presentation timestamp of the previous frame, and currentPresentationTimestamp is, as you guessed, the presentation timestamp of the current one. A maxFrameDistance of 1.1 worked very well in testing, but you can change this to your liking.

// Convert the PTSs to frame indices so the gap can be measured in frames.
let currentFramePosition = (Double(self.frameRate) * Double(currentPresentationTimestamp.value)) / Double(currentPresentationTimestamp.timescale)
let previousFramePosition = (Double(self.frameRate) * Double(previousPresentationTimeStamp.value)) / Double(previousPresentationTimeStamp.timescale)
var presentationTimeStamp = currentPresentationTimestamp
let maxFrameDistance = 1.1
let frameDistance = currentFramePosition - previousFramePosition
if frameDistance > maxFrameDistance {
    // The frame landed too far after the previous one (camera switch);
    // pull it back so it sits exactly one frame after the previous frame.
    let expectedFramePosition = previousFramePosition + 1.0
    //print("[mwCamera]: Frame at incorrect position moving from \(currentFramePosition) to \(expectedFramePosition)")

    // Convert the corrected frame index back into a timestamp value.
    let newFramePosition = ((expectedFramePosition) * Double(currentPresentationTimestamp.timescale)) / Double(self.frameRate)

    let newPresentationTimeStamp = CMTime(value: CMTimeValue(newFramePosition), timescale: currentPresentationTimestamp.timescale)

    presentationTimeStamp = newPresentationTimeStamp
}

let success = assetWriterInputPixelBufferAdator.append(pixelBuffer, withPresentationTime: presentationTimeStamp)
if !success, let error = assetWriter.error {
    fatalError(error.localizedDescription)
}

Also, please note: this worked because I kept the frame rate consistent, so make sure you have total control of the capture device's frame rate throughout this process.
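
For reference, here is a minimal sketch (not part of the original answer) of locking the capture device to a fixed frame rate via activeVideoMinFrameDuration/activeVideoMaxFrameDuration, so that the one-frame spacing assumed above actually holds; the device and targetFPS values are assumed to come from your own capture setup:

import AVFoundation

func lockFrameRate(of device: AVCaptureDevice, to targetFPS: Int32) throws {
    try device.lockForConfiguration()
    // Pin both the minimum and maximum frame duration to 1/targetFPS so the
    // device cannot vary the frame rate during or after the camera switch.
    let frameDuration = CMTime(value: 1, timescale: targetFPS)
    device.activeVideoMinFrameDuration = frameDuration
    device.activeVideoMaxFrameDuration = frameDuration
    device.unlockForConfiguration()
}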

I have a repo using this logic here

The most stable way to fix this problem is to 'pause' recording when switching sources.

But you can also 'fill the gap' with blank video and silent audio frames. This is what I have implemented in my project.

So, create a boolean to block appending new CMSampleBuffers while switching cameras/microphones, and reset it after some delay:

let idleTime = 1.0
self.recordingPaused = true
DispatchQueue.main.asyncAfter(deadline: .now() + idleTime) {
  self.recordingPaused = false
}
writeAllIdleFrames()

In the writeAllIdleFrames method you need to calculate how many frames you need to write:

func writeAllIdleFrames() {
    // Duration of one video frame and of one 1024-sample audio buffer, in seconds.
    let framesPerSecond = 1.0 / self.videoConfig.fps
    let samplesPerSecond = 1024 / self.audioConfig.sampleRate

    // How many synthetic frames/buffers are needed to cover the switch delay.
    let videoFramesCount = Int(ceil(self.switchInputDelay / framesPerSecond))
    let audioFramesCount = Int(ceil(self.switchInputDelay / samplesPerSecond))

    for index in 0..<max(videoFramesCount, audioFramesCount) {
        // create synthetic buffers

        recordingQueue.async {
            if index < videoFramesCount {
                let pts = self.nextVideoPTS()
                self.writeBlankVideo(pts: pts)
            }

            if index < audioFramesCount {
                let pts = self.nextAudioPTS()
                self.writeSilentAudio(pts: pts)
            }
        }
    }
}

How to calculate the next PTS?

func nextVideoPTS() -> CMTime {
    guard var pts = self.lastVideoRawPTS else { return CMTime.invalid }
    
    let framesPerSecond = 1.0 / self.videoConfig.fps
    let delta = CMTime(value: Int64(framesPerSecond * Double(pts.timescale)),
                       timescale: pts.timescale, flags: pts.flags, epoch: pts.epoch)
    pts = CMTimeAdd(pts, delta)
    return pts
}
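
The writeAllIdleFrames method above also calls nextAudioPTS(); a hedged sketch of what that could look like follows, assuming 1024-sample audio buffers as above and a lastAudioRawPTS property tracked the same way lastVideoRawPTS is (that property name is an assumption, not from the original answer):

func nextAudioPTS() -> CMTime {
    // lastAudioRawPTS is assumed to be tracked like lastVideoRawPTS above.
    guard var pts = self.lastAudioRawPTS else { return CMTime.invalid }

    // One audio buffer covers 1024 samples at the configured sample rate.
    let secondsPerBuffer = 1024.0 / self.audioConfig.sampleRate
    let delta = CMTime(value: Int64(secondsPerBuffer * Double(pts.timescale)),
                       timescale: pts.timescale, flags: pts.flags, epoch: pts.epoch)
    pts = CMTimeAdd(pts, delta)
    return pts
}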

Tell me if you also need the code that creates blank/silent video/audio buffers :)
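
In case it helps, here is a minimal sketch of what such a silent-audio helper could look like, assuming 16-bit mono linear PCM that matches the writer input's source format; makeSilentAudioBuffer and its default parameters are illustrative and not part of the original project:

import AVFoundation

func makeSilentAudioBuffer(pts: CMTime, frameCount: Int = 1024, sampleRate: Double = 44_100) -> CMSampleBuffer? {
    // Describe 16-bit mono linear PCM; adjust to match your recording format.
    var asbd = AudioStreamBasicDescription(mSampleRate: sampleRate,
                                           mFormatID: kAudioFormatLinearPCM,
                                           mFormatFlags: kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked,
                                           mBytesPerPacket: 2, mFramesPerPacket: 1, mBytesPerFrame: 2,
                                           mChannelsPerFrame: 1, mBitsPerChannel: 16, mReserved: 0)
    var formatDescription: CMAudioFormatDescription?
    guard CMAudioFormatDescriptionCreate(allocator: kCFAllocatorDefault, asbd: &asbd,
                                         layoutSize: 0, layout: nil,
                                         magicCookieSize: 0, magicCookie: nil,
                                         extensions: nil,
                                         formatDescriptionOut: &formatDescription) == noErr,
          let format = formatDescription else { return nil }

    // A zero-filled block buffer is the actual silence.
    let dataLength = frameCount * MemoryLayout<Int16>.size
    var blockBuffer: CMBlockBuffer?
    guard CMBlockBufferCreateWithMemoryBlock(allocator: kCFAllocatorDefault, memoryBlock: nil,
                                             blockLength: dataLength, blockAllocator: nil,
                                             customBlockSource: nil, offsetToData: 0,
                                             dataLength: dataLength,
                                             flags: kCMBlockBufferAssureMemoryNowFlag,
                                             blockBufferOut: &blockBuffer) == noErr,
          let block = blockBuffer else { return nil }
    CMBlockBufferFillDataBytes(with: 0, blockBuffer: block, offsetIntoDestination: 0, dataLength: dataLength)

    // Wrap the block in a CMSampleBuffer stamped at the desired PTS.
    var timing = CMSampleTimingInfo(duration: CMTime(value: 1, timescale: Int32(sampleRate)),
                                    presentationTimeStamp: pts,
                                    decodeTimeStamp: .invalid)
    var sampleSize = MemoryLayout<Int16>.size
    var sampleBuffer: CMSampleBuffer?
    guard CMSampleBufferCreate(allocator: kCFAllocatorDefault, dataBuffer: block, dataReady: true,
                               makeDataReadyCallback: nil, refcon: nil,
                               formatDescription: format, sampleCount: frameCount,
                               sampleTimingEntryCount: 1, sampleTimingArray: &timing,
                               sampleSizeEntryCount: 1, sampleSizeArray: &sampleSize,
                               sampleBufferOut: &sampleBuffer) == noErr else { return nil }
    return sampleBuffer
}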

I did manage to find an intermediate solution for the sync problem, building on Woody Jean-louis's solution from his repo.

The results are similar to what Instagram does, but it seems to work a little bit better. Basically, what I do is prevent the assetWriterAudioInput from appending new samples while switching cameras. There is no way to know exactly when this happens, so I figured out that before and after the switch the captureOutput method was sending video samples every 0.02 seconds +- (max 0.04 seconds).

Knowing this, I created a self.lastVideoSampleDate that is updated every time a video sample is appended to assetWriterInputPixelBufferAdator, and I only allow an audio sample to be appended to assetWriterAudioInput if the interval since that date is less than 0.05 seconds.

if let assetWriterAudioInput = self.assetWriterAudioInput,
    output == self.audioOutput, assetWriterAudioInput.isReadyForMoreMediaData {

    // Only append audio if a video sample was appended within the last 0.05 s.
    let since = Date().timeIntervalSince(self.lastVideoSampleDate)
    if since < 0.05 {
        let success = assetWriterAudioInput.append(sampleBuffer)
        if !success, let error = assetWriter.error {
            print(error)
            fatalError(error.localizedDescription)
        }
    }
}

// Wherever a video pixel buffer is appended, record the time of that append.
let success = assetWriterInputPixelBufferAdator.append(pixelBuffer, withPresentationTime: presentationTimeStamp)
if !success, let error = assetWriter.error {
    print(error)
    fatalError(error.localizedDescription)
}
self.lastVideoSampleDate = Date()
