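iOS reverse audio through AVAssetWriter
I am trying to reverse audio on iOS using AVAsset and AVAssetWriter. The following code works, but the output file is shorter than the input file. For example, the input file has a duration of 1:59, but the output is 1:50 with the same audio content.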
- (void)reverse:(AVAsset *)asset
{
AVAssetReader* reader = [[AVAssetReader alloc] initWithAsset:asset error:nil];
AVAssetTrack* audioTrack = [[asset tracksWithMediaType:AVMediaTypeAudio] objectAtIndex:0];
NSMutableDictionary* audioReadSettings = [NSMutableDictionary dictionary];
[audioReadSettings setValue:[NSNumber numberWithInt:kAudioFormatLinearPCM]
forKey:AVFormatIDKey];
AVAssetReaderTrackOutput* readerOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:audioTrack outputSettings:audioReadSettings];
[reader addOutput:readerOutput];
[reader startReading];
NSDictionary *outputSettings = [NSDictionary dictionaryWithObjectsAndKeys:
[NSNumber numberWithInt: kAudioFormatMPEG4AAC], AVFormatIDKey,
[NSNumber numberWithFloat:44100.0], AVSampleRateKey,
[NSNumber numberWithInt:2], AVNumberOfChannelsKey,
[NSNumber numberWithInt:128000], AVEncoderBitRateKey,
[NSData data], AVChannelLayoutKey,
nil];
AVAssetWriterInput *writerInput = [[AVAssetWriterInput alloc] initWithMediaType:AVMediaTypeAudio
outputSettings:outputSettings];
NSString *exportPath = [NSTemporaryDirectory() stringByAppendingPathComponent:@"out.m4a"];
NSURL *exportURL = [NSURL fileURLWithPath:exportPath];
NSError *writerError = nil;
AVAssetWriter *writer = [[AVAssetWriter alloc] initWithURL:exportURL
fileType:AVFileTypeAppleM4A
error:&writerError];
[writerInput setExpectsMediaDataInRealTime:NO];
[writer addInput:writerInput];
[writer startWriting];
[writer startSessionAtSourceTime:kCMTimeZero];
CMSampleBufferRef sample = [readerOutput copyNextSampleBuffer];
NSMutableArray *samples = [[NSMutableArray alloc] init];
while (sample != NULL) {
sample = [readerOutput copyNextSampleBuffer];
if (sample == NULL)
continue;
[samples addObject:(__bridge id)(sample)];
CFRelease(sample);
}
NSArray* reversedSamples = [[samples reverseObjectEnumerator] allObjects];
for (id reversedSample in reversedSamples) {
if (writerInput.readyForMoreMediaData) {
[writerInput appendSampleBuffer:(__bridge CMSampleBufferRef)(reversedSample)];
}
else {
[NSThread sleepForTimeInterval:0.05];
}
}
[writerInput markAsFinished];
dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0);
dispatch_async(queue, ^{
[writer finishWriting];
});
}
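Update:
If I write the samples directly in the first while loop, everything works (even with the writerInput.readyForMoreMediaData check). In that case the resulting file has exactly the same duration as the original. But if I write the same samples from the reversed NSArray, the result is shorter.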
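Writing the audio samples in reverse order is not enough. The sample data itself needs to be reversed, and its timing information needs to be set correctly.
In Swift, we create an extension on AVAsset.
The samples must be processed as decompressed samples. To do that, create audio reader settings with kAudioFormatLinearPCM: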
let kAudioReaderSettings = [
AVFormatIDKey: Int(kAudioFormatLinearPCM) as AnyObject,
AVLinearPCMBitDepthKey: 16 as AnyObject,
AVLinearPCMIsBigEndianKey: false as AnyObject,
AVLinearPCMIsFloatKey: false as AnyObject,
AVLinearPCMIsNonInterleaved: false as AnyObject]
Use our AVAsset extension method audioReader:
func audioReader(outputSettings: [String : Any]?) -> (audioTrack:AVAssetTrack?, audioReader:AVAssetReader?, audioReaderOutput:AVAssetReaderTrackOutput?) {
if let audioTrack = self.tracks(withMediaType: .audio).first {
if let audioReader = try? AVAssetReader(asset: self) {
let audioReaderOutput = AVAssetReaderTrackOutput(track: audioTrack, outputSettings: outputSettings)
return (audioTrack, audioReader, audioReaderOutput)
}
}
return (nil, nil, nil)
}
let (_, audioReader, audioReaderOutput) = self.audioReader(outputSettings: kAudioReaderSettings)
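This creates the audioReader (AVAssetReader) and audioReaderOutput (AVAssetReaderTrackOutput) used to read the audio samples. Both are optionals that need unwrapping, and the output must be added to the reader (audioReader.add(audioReaderOutput)) before reading starts.
We need to keep track of the audio samples and the new timing information: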
var audioSamples:[CMSampleBuffer] = []
var timingInfos:[CMSampleTimingInfo] = []
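Now start reading the samples. For each audio sample buffer, get its timing info and generate new timing info relative to the end of the audio track (because we will be writing it back in reverse order).
In other words, we adjust the presentation times of the samples.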
if audioReader.startReading() {
while audioReader.status == .reading {
if let sampleBuffer = audioReaderOutput.copyNextSampleBuffer(){
// process sample
}
}
}
So to "process sample", we use CMSampleBufferGetSampleTimingInfoArray to get the timingInfo (CMSampleTimingInfo):
var timingInfo = CMSampleTimingInfo()
// Receives the number of timing entries the buffer actually holds
var timingInfoCount = CMItemCount()
CMSampleBufferGetSampleTimingInfoArray(sampleBuffer, entryCount: 0, arrayToFill: &timingInfo, entriesNeededOut: &timingInfoCount)
Get the presentation time and duration:
let presentationTime = timingInfo.presentationTimeStamp
let duration = CMSampleBufferGetDuration(sampleBuffer)
Calculate the end time of the sample:
let endTime = CMTimeAdd(presentationTime, duration)
Now calculate the new presentation time relative to the end of the track:
let newPresentationTime = CMTimeSubtract(self.duration, endTime)
And use it to set the timingInfo:
timingInfo.presentationTimeStamp = newPresentationTime
Finally, save the audio sample buffer and its timing info; we need them later when creating the reversed samples:
timingInfos.append(timingInfo)
audioSamples.append(sampleBuffer)
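The timing arithmetic above can be sketched with plain integers. This is only an illustration: BufferTiming and reversedStart are hypothetical names, and times are counted in frames rather than CMTime so the numbers are easy to check by hand.

```swift
// Hypothetical stand-in for CMSampleTimingInfo, using integer frame counts
// instead of CMTime so the arithmetic is easy to follow.
struct BufferTiming {
    let start: Int     // presentation time, in frames
    let duration: Int  // buffer length, in frames
}

/// New start time of a buffer once the whole track is played backwards:
/// trackDuration - (start + duration).
func reversedStart(of timing: BufferTiming, trackDuration: Int) -> Int {
    let end = timing.start + timing.duration
    return trackDuration - end
}

// Three buffers laid end to end: 8192 + 8192 + 5056 = 21440 frames.
let timings = [
    BufferTiming(start: 0, duration: 8192),
    BufferTiming(start: 8192, duration: 8192),
    BufferTiming(start: 16384, duration: 5056),
]
let trackDuration = 21440

let newStarts = timings.map { reversedStart(of: $0, trackDuration: trackDuration) }
print(newStarts) // [13248, 5056, 0]
```

The buffer that was read last ends up at time 0, so it is the first one written. Each new time belongs to the buffer it was computed from.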
We need an AVAssetWriter:
guard let assetWriter = try? AVAssetWriter(outputURL: destinationURL, fileType: AVFileType.wav) else {
// error handling
return
}
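The file type is "wav" because the reversed samples will be written in the uncompressed Linear PCM audio format, as shown below.
For the assetWriter we specify the audio compression settings and a "source format hint", which can be obtained from an uncompressed sample buffer: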
let sampleBuffer = audioSamples[0]
let sourceFormat = CMSampleBufferGetFormatDescription(sampleBuffer)
let audioCompressionSettings = [AVFormatIDKey: kAudioFormatLinearPCM] as [String : Any]
Now we can create the AVAssetWriterInput, add it to the writer, and start writing:
let assetWriterInput = AVAssetWriterInput(mediaType: AVMediaType.audio, outputSettings:audioCompressionSettings, sourceFormatHint: sourceFormat)
assetWriter.add(assetWriterInput)
assetWriter.startWriting()
assetWriter.startSession(atSourceTime: CMTime.zero)
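Now iterate over the sample buffers in reverse order, and reverse the contents of each buffer as well.
We have an extension on CMSampleBuffer that does just that, called "reverse".
Using requestMediaDataWhenReady, we do this as follows: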
let nbrSamples = audioSamples.count
var index = 0
let serialQueue: DispatchQueue = DispatchQueue(label: "com.limit-point.reverse-audio-queue")
assetWriterInput.requestMediaDataWhenReady(on: serialQueue) {
while assetWriterInput.isReadyForMoreMediaData, index < nbrSamples {
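let sampleBuffer = audioSamples[nbrSamples - 1 - index]
// Use the timing computed from this same buffer, so presentation times increase as buffers are written
let timingInfo = timingInfos[nbrSamples - 1 - index]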
if let reversedBuffer = sampleBuffer.reverse(timingInfo: [timingInfo]), assetWriterInput.append(reversedBuffer) == true {
index += 1
}
else {
index = nbrSamples
}
if index == nbrSamples {
assetWriterInput.markAsFinished()
finishWriting() // call assetWriter.finishWriting, check assetWriter status, etc.
}
}
}
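So the last thing to explain is how the audio samples are reversed in the "reverse" method.
We create an extension on CMSampleBuffer that takes a sample buffer and returns the reversed sample buffer with the correct timing: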
func reverse(timingInfo:[CMSampleTimingInfo]) -> CMSampleBuffer?
The data to be reversed is obtained with:
CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer
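The CMSampleBuffer header describes this function as follows:
"Creates an AudioBufferList containing the data from the CMSampleBuffer, and a CMBlockBuffer which references (and manages the lifetime of) the data in that AudioBufferList."
Call it as follows, where "self" refers to the CMSampleBuffer we are reversing, since this is an extension: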
var blockBuffer: CMBlockBuffer? = nil
let audioBufferList: UnsafeMutableAudioBufferListPointer = AudioBufferList.allocate(maximumBuffers: 1)
CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(
self,
bufferListSizeNeededOut: nil,
bufferListOut: audioBufferList.unsafeMutablePointer,
bufferListSize: AudioBufferList.sizeInBytes(maximumBuffers: 1),
blockBufferAllocator: nil,
blockBufferMemoryAllocator: nil,
flags: kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment,
blockBufferOut: &blockBuffer
)
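Now you can access the raw data (mData is optional, so unwrap it first):
guard let data: UnsafeMutableRawPointer = audioBufferList.unsafePointer.pointee.mBuffers.mData else { return nil }
To reverse the data we need to access it as an array of "samples" called sampleArray, which is done in Swift as follows: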
let samples = data.assumingMemoryBound(to: Int16.self)
let sizeofInt16 = MemoryLayout<Int16>.size
let dataSize = audioBufferList.unsafePointer.pointee.mBuffers.mDataByteSize
let dataCount = Int(dataSize) / sizeofInt16
var sampleArray = Array(UnsafeBufferPointer(start: samples, count: dataCount)) as [Int16]
Now reverse the array sampleArray:
sampleArray.reverse()
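One caveat, assuming the track has more than one channel: the reader settings above request interleaved PCM, so a flat reverse of the Int16 array also swaps the channels inside every frame. A minimal sketch of reversing frame by frame instead is below; reverseInterleaved and its channels parameter are hypothetical, and in real code the channel count would come from the buffer's format description.

```swift
/// Reverses interleaved PCM frame by frame, keeping the channel order
/// inside each frame. `channels` is an assumed parameter, not something
/// the code above provides.
func reverseInterleaved(_ samples: [Int16], channels: Int) -> [Int16] {
    precondition(channels > 0 && samples.count % channels == 0)
    var result = [Int16]()
    result.reserveCapacity(samples.count)
    var frameStart = samples.count - channels
    while frameStart >= 0 {
        // Copy one whole frame (all channels) at a time.
        result.append(contentsOf: samples[frameStart..<frameStart + channels])
        frameStart -= channels
    }
    return result
}

// Stereo frames (L, R): (1, -1), (2, -2), (3, -3)
let stereo: [Int16] = [1, -1, 2, -2, 3, -3]
let flat = Array(stereo.reversed())
let byFrame = reverseInterleaved(stereo, channels: 2)
print(flat)     // [-3, 3, -2, 2, -1, 1]  left and right swapped
print(byFrame)  // [3, -3, 2, -2, 1, -1]  channels preserved
```

For mono audio the two approaches give the same result, so the flat reverse is fine there.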
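With the samples reversed, we need to create a new CMSampleBuffer containing the reversed samples and the new timing information we generated earlier while reading the audio samples from the source file.
To do that, replace the data in the CMBlockBuffer we obtained earlier from CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer.
Copy the bytes of the reversed array over the original sample data: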
var status:OSStatus = noErr
sampleArray.withUnsafeBytes { sampleArrayPtr in
if let baseAddress = sampleArrayPtr.baseAddress {
let bufferPointer: UnsafePointer<Int16> = baseAddress.assumingMemoryBound(to: Int16.self)
let rawPtr = UnsafeRawPointer(bufferPointer)
status = CMBlockBufferReplaceDataBytes(with: rawPtr, blockBuffer: blockBuffer!, offsetIntoDestination: 0, dataLength: Int(dataSize))
}
}
if status != noErr {
return nil
}
Finally, create the new sample buffer with CMSampleBufferCreate. This function needs two arguments that we can take from the original sample buffer, namely formatDescription and numberOfSamples:
let formatDescription = CMSampleBufferGetFormatDescription(self)
let numberOfSamples = CMSampleBufferGetNumSamples(self)
var newBuffer:CMSampleBuffer?
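Now create the new sample buffer with the reversed block buffer and, most notably, the new timing information that was passed as an argument to the "reverse" function we are defining: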
guard CMSampleBufferCreate(allocator: kCFAllocatorDefault, dataBuffer: blockBuffer, dataReady: true, makeDataReadyCallback: nil, refcon: nil, formatDescription: formatDescription, sampleCount: numberOfSamples, sampleTimingEntryCount: timingInfo.count, sampleTimingArray: timingInfo, sampleSizeEntryCount: 0, sampleSizeArray: nil, sampleBufferOut: &newBuffer) == noErr else {
return self
}
return newBuffer
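That's all there is to it!
As a final note, the Core Audio and AVFoundation headers provide a lot of useful information, such as CoreAudioTypes.h, CMSampleBuffer.h, and many more.
A complete example, using Swift 5, of reversing both video and audio into the same output asset, with the audio processed as suggested above: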
private func reverseVideo(inURL: URL, outURL: URL, queue: DispatchQueue, _ completionBlock: ((Bool)->Void)?) {
Log.info("Start reverse video!")
let asset = AVAsset.init(url: inURL)
guard
let reader = try? AVAssetReader.init(asset: asset),
let videoTrack = asset.tracks(withMediaType: .video).first,
let audioTrack = asset.tracks(withMediaType: .audio).first
else {
assert(false)
completionBlock?(false)
return
}
let width = videoTrack.naturalSize.width
let height = videoTrack.naturalSize.height
// Video reader
let readerVideoSettings: [String : Any] = [ String(kCVPixelBufferPixelFormatTypeKey) : kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange,]
let readerVideoOutput = AVAssetReaderTrackOutput.init(track: videoTrack, outputSettings: readerVideoSettings)
reader.add(readerVideoOutput)
// Audio reader
let readerAudioSettings: [String : Any] = [
AVFormatIDKey: kAudioFormatLinearPCM,
AVLinearPCMBitDepthKey: 16 ,
AVLinearPCMIsBigEndianKey: false ,
AVLinearPCMIsFloatKey: false,]
let readerAudioOutput = AVAssetReaderTrackOutput.init(track: audioTrack, outputSettings: readerAudioSettings)
reader.add(readerAudioOutput)
//Start reading content
reader.startReading()
//Reading video samples
var videoBuffers = [CMSampleBuffer]()
while let nextBuffer = readerVideoOutput.copyNextSampleBuffer() {
videoBuffers.append(nextBuffer)
}
//Reading audio samples
var audioBuffers = [CMSampleBuffer]()
var timingInfos = [CMSampleTimingInfo]()
while let nextBuffer = readerAudioOutput.copyNextSampleBuffer() {
var timingInfo = CMSampleTimingInfo()
var timingInfoCount = CMItemCount()
CMSampleBufferGetSampleTimingInfoArray(nextBuffer, entryCount: 0, arrayToFill: &timingInfo, entriesNeededOut: &timingInfoCount)
let duration = CMSampleBufferGetDuration(nextBuffer)
let endTime = CMTimeAdd(timingInfo.presentationTimeStamp, duration)
// Measure from the end of the whole asset, not from the length of this one buffer
let newPresentationTime = CMTimeSubtract(asset.duration, endTime)
timingInfo.presentationTimeStamp = newPresentationTime
timingInfos.append(timingInfo)
audioBuffers.append(nextBuffer)
}
//Stop reading
let status = reader.status
reader.cancelReading()
guard status == .completed, let firstVideoBuffer = videoBuffers.first, let firstAudioBuffer = audioBuffers.first else {
assert(false)
completionBlock?(false)
return
}
//Start video time
let sessionStartTime = CMSampleBufferGetPresentationTimeStamp(firstVideoBuffer)
//Writer for video
let writerVideoSettings: [String:Any] = [
AVVideoCodecKey : AVVideoCodecType.h264,
AVVideoWidthKey : width,
AVVideoHeightKey: height,
]
let writerVideoInput: AVAssetWriterInput
if let formatDescription = videoTrack.formatDescriptions.last {
writerVideoInput = AVAssetWriterInput.init(mediaType: .video, outputSettings: writerVideoSettings, sourceFormatHint: (formatDescription as! CMFormatDescription))
} else {
writerVideoInput = AVAssetWriterInput.init(mediaType: .video, outputSettings: writerVideoSettings)
}
writerVideoInput.transform = videoTrack.preferredTransform
writerVideoInput.expectsMediaDataInRealTime = false
//Writer for audio
let writerAudioSettings: [String:Any] = [
AVFormatIDKey : kAudioFormatMPEG4AAC,
AVSampleRateKey : 44100,
AVNumberOfChannelsKey: 2,
AVEncoderBitRateKey:128000,
AVChannelLayoutKey: NSData(),
]
let sourceFormat = CMSampleBufferGetFormatDescription(firstAudioBuffer)
let writerAudioInput: AVAssetWriterInput = AVAssetWriterInput.init(mediaType: .audio, outputSettings: writerAudioSettings, sourceFormatHint: sourceFormat)
// Offline processing from buffered samples, so this is not a real-time source
writerAudioInput.expectsMediaDataInRealTime = false
guard
let writer = try? AVAssetWriter.init(url: outURL, fileType: .mp4),
writer.canAdd(writerVideoInput),
writer.canAdd(writerAudioInput)
else {
assert(false)
completionBlock?(false)
return
}
let pixelBufferAdaptor = AVAssetWriterInputPixelBufferAdaptor.init(assetWriterInput: writerVideoInput, sourcePixelBufferAttributes: nil)
let group = DispatchGroup.init()
group.enter()
writer.add(writerVideoInput)
writer.add(writerAudioInput)
writer.startWriting()
writer.startSession(atSourceTime: sessionStartTime)
var videoFinished = false
var audioFinished = false
//Write video samples in reverse order
var currentSample = 0
writerVideoInput.requestMediaDataWhenReady(on: queue) {
for i in currentSample..<videoBuffers.count {
currentSample = i
if !writerVideoInput.isReadyForMoreMediaData {
return
}
let presentationTime = CMSampleBufferGetPresentationTimeStamp(videoBuffers[i])
guard let imageBuffer = CMSampleBufferGetImageBuffer(videoBuffers[videoBuffers.count - i - 1]) else {
Log.info("VideoWriter reverseVideo: warning, could not get imageBuffer from SampleBuffer...")
continue
}
if !pixelBufferAdaptor.append(imageBuffer, withPresentationTime: presentationTime) {
Log.info("VideoWriter reverseVideo: warning, could not append imageBuffer...")
}
}
// finish write video samples
writerVideoInput.markAsFinished()
Log.info("Video writing finished!")
videoFinished = true
if(audioFinished){
group.leave()
}
}
//Write audio samples in reverse order
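let totalAudioSamples = audioBuffers.count
// Index of the next buffer to write; kept outside the block so a later callback resumes instead of restarting
var currentAudioSample = 0
writerAudioInput.requestMediaDataWhenReady(on: queue) {
// Iterate over every buffer, including the last one
for i in currentAudioSample..<totalAudioSamples {
currentAudioSample = i
if !writerAudioInput.isReadyForMoreMediaData {
return
}
let audioSample = audioBuffers[totalAudioSamples-1-i]
// Timing computed from this same buffer
let timingInfo = timingInfos[totalAudioSamples-1-i]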
// reverse samples data using timing info
if let reversedBuffer = audioSample.reverse(timingInfo: [timingInfo]) {
// append data
if writerAudioInput.append(reversedBuffer) == false {
break
}
}
}
// finish
writerAudioInput.markAsFinished()
Log.info("Audio writing finished!")
audioFinished = true
if(videoFinished){
group.leave()
}
}
group.notify(queue: queue) {
writer.finishWriting {
if writer.status != .completed {
Log.info("VideoWriter reverse video: error - \(String(describing: writer.error))")
completionBlock?(false)
} else {
Log.info("Ended reverse video!")
completionBlock?(true)
}
}
}
}
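Happy coding!
Print out the size of each buffer in number of samples (in the "reading" readerOutput while loop), and do the same in the "writing" writerInput for loop. That way you can see all the buffer sizes and check whether they add up.
For example, when if (writerInput.readyForMoreMediaData) is false, are you missing or skipping a buffer? You "sleep", but then continue with the next reversedSample in reversedSamples, so that buffer is effectively dropped from the writerInput.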
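The skipping can be reproduced without AVFoundation. In this sketch FakeWriterInput is a hypothetical stand-in for the writer input that reports ready only on every other poll; it is not how AVAssetWriterInput actually schedules itself, just enough to show the difference between the two loop shapes.

```swift
// Hypothetical stand-in for AVAssetWriterInput: it is "ready" only on
// every other poll, which is enough to expose the difference.
final class FakeWriterInput {
    private var polls = 0
    private(set) var written: [Int] = []
    var isReady: Bool {
        polls += 1
        return polls % 2 == 1
    }
    func append(_ buffer: Int) { written.append(buffer) }
}

let buffers = Array(0..<6)

// Pattern from the question: when the input is busy, the loop waits and
// then moves on to the next buffer, so the current one is never written.
let skipping = FakeWriterInput()
for buffer in buffers {
    if skipping.isReady {
        skipping.append(buffer)
    } else {
        // the question sleeps here, then falls through to the next buffer
    }
}

// Retrying pattern: stay on the same buffer until it has been appended.
let retrying = FakeWriterInput()
var index = 0
while index < buffers.count {
    if retrying.isReady {
        retrying.append(buffers[index])
        index += 1
    }
    // a real loop would sleep or yield here before polling again
}

print(skipping.written) // [0, 2, 4]
print(retrying.written) // [0, 1, 2, 3, 4, 5]
```

Every dropped buffer shortens the output, which would be consistent with a file that comes out shorter than the input.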
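UPDATE (based on comments): I found two problems in the code.
First, the channel count in the output settings should be: [NSNumber numberWithInt:1], AVNumberOfChannelsKey. Compare the info for the output and input files.
For the second, I log the size of each buffer in samples, so line 76 is now: size_t sampleSize = CMSampleBufferGetNumSamples(sample);
The output looks like: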
2015-03-19 22:26:28.171 audioReverse[25012:4901250] Reading [0]: 8192
2015-03-19 22:26:28.172 audioReverse[25012:4901250] Reading [1]: 8192
...
2015-03-19 22:26:28.651 audioReverse[25012:4901250] Reading [640]: 8192
2015-03-19 22:26:28.651 audioReverse[25012:4901250] Reading [641]: 8192
2015-03-19 22:26:28.651 audioReverse[25012:4901250] Reading [642]: 5056
2015-03-19 22:26:28.651 audioReverse[25012:4901250] Writing [0]: 5056
2015-03-19 22:26:28.652 audioReverse[25012:4901250] Writing [1]: 8192
...
2015-03-19 22:26:29.134 audioReverse[25012:4901250] Writing [640]: 8192
2015-03-19 22:26:29.135 audioReverse[25012:4901250] Writing [641]: 8192
2015-03-19 22:26:29.135 audioReverse[25012:4901250] Writing [642]: 8192
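This shows that you are reversing the order of the buffers of 8192 samples each, but within each buffer the audio is still "facing forward". We can see this in this screenshot, where I compare a correct (sample-by-sample) reversal with the buffer-by-buffer reversal:
I think your current scheme can work if you also reverse the samples inside every 8192-sample buffer. I personally would not recommend using NSArray enumerators for signal processing, but it can work if you operate at the sample level.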
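A small illustration of the difference, using a made-up track of 8 samples read in buffers of 3 in place of the 8192-sample buffers from the log:

```swift
// A tiny "track" of 8 samples, read in buffers of 3 (the last one is short),
// standing in for the 8192-sample buffers in the log above.
let track: [Int16] = [1, 2, 3, 4, 5, 6, 7, 8]
let bufferSize = 3
let buffers: [[Int16]] = stride(from: 0, to: track.count, by: bufferSize).map {
    Array(track[$0..<min($0 + bufferSize, track.count)])
}

// Reversing only the buffer order: each buffer still plays forwards.
let bufferOrderOnly: [Int16] = buffers.reversed().flatMap { $0 }

// Reversing the buffer order and the samples inside each buffer.
let fullyReversed: [Int16] = buffers.reversed().flatMap { $0.reversed() }

print(bufferOrderOnly) // [7, 8, 4, 5, 6, 1, 2, 3]
print(fullyReversed)   // [8, 7, 6, 5, 4, 3, 2, 1]
```

Only the second result is the track played backwards; the first is the original audio cut into chunks and shuffled.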
extension CMSampleBuffer {
func reverse(timingInfo:[CMSampleTimingInfo]) -> CMSampleBuffer? {
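var blockBuffer: CMBlockBuffer? = nil
let audioBufferList: UnsafeMutableAudioBufferListPointer = AudioBufferList.allocate(maximumBuffers: 1)
// AudioBufferList.allocate uses malloc, so release the list when the function exits
defer { free(audioBufferList.unsafeMutablePointer) }
let listStatus = CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(
self,
bufferListSizeNeededOut: nil,
bufferListOut: audioBufferList.unsafeMutablePointer,
bufferListSize: AudioBufferList.sizeInBytes(maximumBuffers: 1),
blockBufferAllocator: nil,
blockBufferMemoryAllocator: nil,
flags: kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment,
blockBufferOut: &blockBuffer
)
// Bail out if the buffer list or its backing block buffer could not be obtained
guard listStatus == noErr, blockBuffer != nil else { return nil }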
if let data = audioBufferList.unsafePointer.pointee.mBuffers.mData {
let samples = data.assumingMemoryBound(to: Int16.self)
let sizeofInt16 = MemoryLayout<Int16>.size
let dataSize = audioBufferList.unsafePointer.pointee.mBuffers.mDataByteSize
let dataCount = Int(dataSize) / sizeofInt16
var sampleArray = Array(UnsafeBufferPointer(start: samples, count: dataCount)) as [Int16]
sampleArray.reverse()
var status:OSStatus = noErr
sampleArray.withUnsafeBytes { sampleArrayPtr in
if let baseAddress = sampleArrayPtr.baseAddress {
let bufferPointer: UnsafePointer<Int16> = baseAddress.assumingMemoryBound(to: Int16.self)
let rawPtr = UnsafeRawPointer(bufferPointer)
status = CMBlockBufferReplaceDataBytes(with: rawPtr, blockBuffer: blockBuffer!, offsetIntoDestination: 0, dataLength: Int(dataSize))
}
}
if status != noErr {
return nil
}
let formatDescription = CMSampleBufferGetFormatDescription(self)
let numberOfSamples = CMSampleBufferGetNumSamples(self)
var newBuffer:CMSampleBuffer?
guard CMSampleBufferCreate(allocator: kCFAllocatorDefault, dataBuffer: blockBuffer, dataReady: true, makeDataReadyCallback: nil, refcon: nil, formatDescription: formatDescription, sampleCount: numberOfSamples, sampleTimingEntryCount: timingInfo.count, sampleTimingArray: timingInfo, sampleSizeEntryCount: 0, sampleSizeArray: nil, sampleBufferOut: &newBuffer) == noErr else {
return self
}
return newBuffer
}
return nil
}
}
The missing function!