How can I append to a recorded MPEG4 AAC file?

I'm recording audio on an iPhone, using an AVAudioRecorder with the following settings:

NSDictionary *recordSettings = [[NSDictionary alloc] initWithObjectsAndKeys:
       [NSNumber numberWithInt: kAudioFormatMPEG4AAC], AVFormatIDKey,
       [NSNumber numberWithFloat:44100.0], AVSampleRateKey,
       [NSNumber numberWithInt:1], AVNumberOfChannelsKey,
       [NSNumber numberWithInt:12800], AVEncoderBitRateKey,
       [NSNumber numberWithInt:16], AVLinearPCMBitDepthKey,
       [NSNumber numberWithInt: AVAudioQualityHigh],  AVEncoderAudioQualityKey,
       nil];

(I can be flexible on most of these settings, but I have to use MPEG4 AAC.)

I save the audio to a file.

The user needs to be able to come back at a later date and continue recording to the same file. There doesn't seem to be an option to do this directly with AVAudioRecorder, so instead I'm recording to a new file and concatenating the two.

At the moment I'm appending the files using an AVMutableComposition and an AVMutableCompositionTrack as here, but it's really slow for longer recordings, so this isn't really feasible.

I'm thinking it would be much quicker if I could strip the header from the second file, append the audio data to the first file, then alter the header of the combined file to reflect the new duration. As I know both files were created with exactly the same settings, I figure the other details in the headers should be identical.

Unfortunately I can't find any information about what format the headers are in, or whether it's possible to combine files in this way.

So my questions are:

  • What is the format of the MPEG-4 AAC file header, when created on an iPhone?
  • Can I combine two audio files by messing with the headers like this?
  • Is there a better way of appending two MPEG-4 AAC audio files almost instantaneously?

Though we ask the AVAudioRecorder to record in MPEG4-AAC format, it always produces a .caf (Core Audio Format) file. This is just a wrapper format, however, and the actual audio data it contains is in AAC format.

In the end, appending files came down to manipulating the .caf files byte-by-byte. The spec for Core Audio Format files is here. Digesting this document and processing the files accordingly was a little off-putting at first, but it turns out the spec is very clear and complete, so it wasn't too onerous.

As the spec explains, a .caf file consists of chunks, each beginning with a four-byte name. For AAC files, there's always a desc chunk and a kuki chunk. As we know our two original files are in the same format, we can copy these chunks unchanged to the output file.

There's also a pakt chunk and a data chunk. We can't guarantee which order these will appear in within the input files. There may or may not be a free chunk - but this just contains padding 0x00's, so we needn't copy it to the output file.
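As a rough illustration of the chunk layout just described, here is a minimal C sketch (usable from Objective-C) that lists the chunks in a .caf file already loaded into memory. The names CafChunk, caf_read_be64 and caf_list_chunks are made up for this example. It assumes the layout given in the spec: an 8-byte file header starting with 'caff', then chunks that each have a 4-byte name followed by a big-endian 64-bit size.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* One chunk found in a CAF file image held in memory (hypothetical type). */
typedef struct {
    char    type[5];  /* four-character chunk name, NUL-terminated */
    int64_t size;     /* mChunkSize exactly as stored in the file */
    size_t  offset;   /* offset of the chunk's data section in the buffer */
} CafChunk;

/* Decode a big-endian 64-bit integer from raw bytes. */
static int64_t caf_read_be64(const uint8_t *p) {
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) v = (v << 8) | p[i];
    return (int64_t)v;
}

/* Walk the chunks of a CAF image. Returns the number of chunks stored in
   `out` (at most `max`), or -1 if the buffer does not start with 'caff'. */
static int caf_list_chunks(const uint8_t *buf, size_t len,
                           CafChunk *out, int max) {
    if (len < 8 || memcmp(buf, "caff", 4) != 0) return -1;
    size_t pos = 8;  /* skip file type, version and flags */
    int n = 0;
    while (n < max && len - pos >= 12) {
        CafChunk c;
        memcpy(c.type, buf + pos, 4);
        c.type[4] = '\0';
        c.size = caf_read_be64(buf + pos + 4);
        c.offset = pos + 12;
        out[n++] = c;
        /* Stop at a data chunk of size -1 (runs to end of file) or at a
           chunk whose declared size exceeds what is left in the buffer. */
        if (c.size < 0 || (uint64_t)c.size > len - c.offset) break;
        pos = c.offset + (size_t)c.size;
    }
    return n;
}
```

Calling caf_list_chunks on each input file gives the positions of the desc, kuki, pakt and data chunks whatever order they are stored in, and free chunks can simply be skipped by name.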

To combine the pakt chunks, we need to examine the chunk headers and produce a new pakt chunk whose mNumberPackets and mNumberValidFrames fields are the sums of those in the input files. The mPrimingFrames and mRemainderFrames are always zero - these are only relevant for streaming media. The bulk of the pakt chunks (i.e. the actual packet table data) can just be concatenated.
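The header merge described above might look something like this in C. This is only a sketch: the function names are invented for the example, and it assumes the 24-byte packet table header layout from the spec (two big-endian 64-bit fields followed by two 32-bit fields). It writes the priming and remainder fields as zero, as observed above, so check your own files before relying on that.

```c
#include <stdint.h>
#include <string.h>

#define CAF_PAKT_HEADER_SIZE 24  /* 8 + 8 + 4 + 4 bytes */

/* Read a big-endian 64-bit integer from raw bytes. */
static int64_t pakt_get64(const uint8_t *p) {
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) v = (v << 8) | p[i];
    return (int64_t)v;
}

/* Write a 64-bit integer to raw bytes in big-endian order. */
static void pakt_put64(uint8_t *p, int64_t value) {
    uint64_t u = (uint64_t)value;
    for (int i = 7; i >= 0; i--) {
        p[i] = (uint8_t)(u & 0xFF);
        u >>= 8;
    }
}

/* Build the packet table header of the combined file from the headers of
   the two inputs. Each argument points at CAF_PAKT_HEADER_SIZE bytes. */
static void pakt_merge_headers(const uint8_t *a, const uint8_t *b,
                               uint8_t *out) {
    /* mNumberPackets: bytes 0..7 */
    pakt_put64(out, pakt_get64(a) + pakt_get64(b));
    /* mNumberValidFrames: bytes 8..15 */
    pakt_put64(out + 8, pakt_get64(a + 8) + pakt_get64(b + 8));
    /* mPrimingFrames and mRemainderFrames: bytes 16..23, assumed zero */
    memset(out + 16, 0, 8);
}
```

The merged header is then followed by the packet table entries of the first file and those of the second file, and the new chunk's mChunkSize is 24 plus the combined length of those entries.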

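Similarly for the data chunks: the mChunkSize fields need to be summed and then the bulk of the data can be concatenated. Note that, per the spec, each data chunk starts with a 4-byte mEditCount field that is included in its mChunkSize, so write that field (and count its 4 bytes) only once in the combined chunk.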

Be careful when reading data from all the binary numeric fields within these files: the files are big-endian but the iPhone is little-endian.
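One way to sidestep the byte-order problem is to assemble each value from individual bytes rather than casting a pointer into the file buffer. The helpers below are a small sketch with made-up names (be32_decode, be64_decode); they give the same result on any host byte order.

```c
#include <stdint.h>

/* Assemble a big-endian 32-bit unsigned value from four raw bytes. */
static uint32_t be32_decode(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Assemble a big-endian 64-bit signed value (e.g. mChunkSize) from
   eight raw bytes, built from two 32-bit halves. */
static int64_t be64_decode(const uint8_t *p) {
    uint64_t hi = be32_decode(p);
    uint64_t lo = be32_decode(p + 4);
    return (int64_t)((hi << 32) | lo);
}
```

If you would rather read whole integers and swap them afterwards, Core Foundation's byte-order functions such as CFSwapInt32BigToHost and CFSwapInt64BigToHost do the same job.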

For extra credit, you might also like to consider deleting segments of audio from within a file, or inserting one audio file into the middle of another. This is a little trickier as you have to parse the contents of the pakt chunk. Again it's a case of following the spec: there's a good description of how the packet sizes are stored as variable-length integers, so you'll have to parse these to find how many bytes each packet takes up in the data chunk, and calculate their positions accordingly.
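To give an idea of what parsing the packet table involves, here is a sketch in C with invented function names. It assumes the variable-length integer encoding described in the spec (seven value bits per byte, most significant group first, high bit set on every byte except the last), and that each table entry is a single packet size, which should be the case when every packet holds the same number of frames, as with AAC.

```c
#include <stdint.h>
#include <stddef.h>

/* Decode one variable-length integer from `p` (at most `len` bytes).
   Returns the number of bytes consumed, or 0 if the value is truncated
   or longer than 9 bytes. */
static size_t caf_varint_decode(const uint8_t *p, size_t len,
                                uint64_t *value) {
    uint64_t v = 0;
    for (size_t i = 0; i < len && i < 9; i++) {
        v = (v << 7) | (uint64_t)(p[i] & 0x7F);
        if ((p[i] & 0x80) == 0) {  /* high bit clear: last byte */
            *value = v;
            return i + 1;
        }
    }
    return 0;
}

/* Byte offset, within the audio data, of the packet at `index`, found by
   summing the sizes of all earlier packets. Returns -1 if the table ends
   before `index` entries have been read. */
static int64_t caf_packet_offset(const uint8_t *table, size_t len,
                                 uint64_t index) {
    size_t pos = 0;
    uint64_t offset = 0;
    for (uint64_t k = 0; k < index; k++) {
        uint64_t size = 0;
        size_t used = caf_varint_decode(table + pos, len - pos, &size);
        if (used == 0) return -1;
        pos += used;
        offset += size;
    }
    return (int64_t)offset;
}
```

Here `table` points just past the 24-byte packet table header. To cut or insert at a packet boundary you need both the offset into the audio data (from caf_packet_offset) and the position within the table where that packet's entry starts, so that the table and the data can be split at matching points.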

All in all this is rather more hassle than I was hoping for. Maybe there's an open source library that will do all this for you, but I couldn't find one.

However, handling raw files like this is blindingly fast compared to using AVMutableComposition and AVMutableCompositionTrack as in the original question - inserting an hour-long recording into another of the same length takes about two seconds.

Good luck!

I found a way that was much faster to implement:

  1. Record with AVAudioRecorder, using the extension "m4a" for a temporary file. You can also use "caf" if you want, but it's unnecessary.

  2. Modify the code here to use AVAssetExportPresetPassthrough, exportSession.outputFileType = AVFileTypeQuickTimeMovie and the filename "audioJoined.mov". Use your newly recorded temporary m4a and an existing m4a file. This gives you an instant join (no recompression) and produces a "mov".

Note. Unfortunately the AVAudioPlayer cannot play a "mov", so the next step is to convert it to something playable. However, if you are just going to share the file somewhere, you could potentially skip the next step, since the mov is perfectly playable on a Mac in QuickTime. It can also be played in iTunes, synced back to an iPhone and played in the iPod app.

  3. Convert the mov back to an m4a using [[AVAssetExportSession alloc] initWithAsset:movFileAsset presetName:AVAssetExportPresetAppleM4A], with @"audioJoined.m4a" for the filename and exportSession.outputFileType = AVFileTypeAppleM4A. Again, this is instant. I'm guessing that the exporter is smarter in this situation when it starts with a mov asset rather than an AVMutableComposition asset.

I'm using this technique in an app that is able to resume recording after recording has been stopped and the file has been played, or even after the app is restarted. Pretty cool.
