
How to generate an audio waveform programmatically while recording voice in iOS?

[screenshot: the waveform image generated after recording]

I'm working on voice modulation / audio frequency in iOS... everything is working fine... I just need a simple way to generate an audio waveform when noise is detected...

Please don't refer me to the code tutorials of... SpeakHere and aurioTouch... I need some good suggestions from native app developers.

I have recorded the audio and made it play back after recording. I have created a waveform and attached a screenshot. But it has to be drawn in the view while the recording is in progress.

// Renders a buffer of interleaved 16-bit samples into a waveform image:
// the left channel is drawn in white and the right channel (if present) in red,
// on a black background.
-(UIImage *) audioImageGraph:(SInt16 *) samples
                normalizeMax:(SInt16) normalizeMax
                 sampleCount:(NSInteger) sampleCount
                channelCount:(NSInteger) channelCount
                 imageHeight:(float) imageHeight {

    CGSize imageSize = CGSizeMake(sampleCount, imageHeight);
    UIGraphicsBeginImageContext(imageSize);
    CGContextRef context = UIGraphicsGetCurrentContext();

    CGContextSetFillColorWithColor(context, [UIColor blackColor].CGColor);
    CGContextSetAlpha(context,1.0);
    CGRect rect;
    rect.size = imageSize;
    rect.origin.x = 0;
    rect.origin.y = 0;

    CGColorRef leftcolor = [[UIColor whiteColor] CGColor];
    CGColorRef rightcolor = [[UIColor redColor] CGColor];

    CGContextFillRect(context, rect);

    CGContextSetLineWidth(context, 1.0);

    // Each channel gets its own horizontal band; halfGraphHeight is half of one band.
    float halfGraphHeight = (imageHeight / 2) / (float) channelCount ;
    float centerLeft = halfGraphHeight;
    float centerRight = (halfGraphHeight*3) ;
    // Scale factor that maps the loudest sample (normalizeMax) to one full band height.
    float sampleAdjustmentFactor = (imageHeight/ (float) channelCount) / (float) normalizeMax;

    for (NSInteger intSample = 0 ; intSample < sampleCount ; intSample ++ ) {
        SInt16 left = *samples++;
        float pixels = (float) left;
        pixels *= sampleAdjustmentFactor;
        CGContextMoveToPoint(context, intSample, centerLeft-pixels);
        CGContextAddLineToPoint(context, intSample, centerLeft+pixels);
        CGContextSetStrokeColorWithColor(context, leftcolor);
        CGContextStrokePath(context);

        if (channelCount==2) {
            SInt16 right = *samples++;
            float pixels = (float) right;
            pixels *= sampleAdjustmentFactor;
            CGContextMoveToPoint(context, intSample, centerRight - pixels);
            CGContextAddLineToPoint(context, intSample, centerRight + pixels);
            CGContextSetStrokeColorWithColor(context, rightcolor);
            CGContextStrokePath(context);
        }
    }

    // Create new image
    UIImage *newImage = UIGraphicsGetImageFromCurrentImageContext();

    // Tidy up
    UIGraphicsEndImageContext();

    return newImage;
}

Next, a method that takes an AVURLAsset and returns PNG data:

- (NSData *) renderPNGAudioPictogramForAssett:(AVURLAsset *)songAsset {

    NSError * error = nil;


    AVAssetReader * reader = [[AVAssetReader alloc] initWithAsset:songAsset error:&error];

    AVAssetTrack * songTrack = [songAsset.tracks objectAtIndex:0];

    NSDictionary* outputSettingsDict = [[NSDictionary alloc] initWithObjectsAndKeys:

                                        [NSNumber numberWithInt:kAudioFormatLinearPCM],AVFormatIDKey,
                                        //     [NSNumber numberWithInt:44100.0],AVSampleRateKey, /*Not Supported*/
                                        //     [NSNumber numberWithInt: 2],AVNumberOfChannelsKey,    /*Not Supported*/

                                        [NSNumber numberWithInt:16],AVLinearPCMBitDepthKey,
                                        [NSNumber numberWithBool:NO],AVLinearPCMIsBigEndianKey,
                                        [NSNumber numberWithBool:NO],AVLinearPCMIsFloatKey,
                                        [NSNumber numberWithBool:NO],AVLinearPCMIsNonInterleaved,

                                        nil];


    AVAssetReaderTrackOutput* output = [[AVAssetReaderTrackOutput alloc] initWithTrack:songTrack outputSettings:outputSettingsDict];

    [reader addOutput:output];
    [output release];

    UInt32 sampleRate = 0, channelCount = 0;

    NSArray* formatDesc = songTrack.formatDescriptions;
    for(unsigned int i = 0; i < [formatDesc count]; ++i) {
        CMAudioFormatDescriptionRef item = (CMAudioFormatDescriptionRef)[formatDesc objectAtIndex:i];
        const AudioStreamBasicDescription* fmtDesc = CMAudioFormatDescriptionGetStreamBasicDescription (item);
        if(fmtDesc ) {

            sampleRate = fmtDesc->mSampleRate;
            channelCount = fmtDesc->mChannelsPerFrame;

            //    NSLog(@"channels:%u, bytes/packet: %u, sampleRate %f",fmtDesc->mChannelsPerFrame, fmtDesc->mBytesPerPacket,fmtDesc->mSampleRate);
        }
    }


    UInt32 bytesPerSample = 2 * channelCount; // 2 bytes (16 bits) per sample, times the interleaved channel count
    SInt16 normalizeMax = 0;

    NSMutableData * fullSongData = [[NSMutableData alloc] init];
    [reader startReading];


    UInt64 totalBytes = 0;


    SInt64 totalLeft = 0;
    SInt64 totalRight = 0;
    NSInteger sampleTally = 0;

    NSInteger samplesPerPixel = sampleRate / 50; // average roughly 1/50 s of audio into each waveform column


    while (reader.status == AVAssetReaderStatusReading){

        AVAssetReaderTrackOutput * trackOutput = (AVAssetReaderTrackOutput *)[reader.outputs objectAtIndex:0];
        CMSampleBufferRef sampleBufferRef = [trackOutput copyNextSampleBuffer];

        if (sampleBufferRef){
            CMBlockBufferRef blockBufferRef = CMSampleBufferGetDataBuffer(sampleBufferRef);

            size_t length = CMBlockBufferGetDataLength(blockBufferRef);
            totalBytes += length;


            NSAutoreleasePool *wader = [[NSAutoreleasePool alloc] init];

            NSMutableData * data = [NSMutableData dataWithLength:length];
            CMBlockBufferCopyDataBytes(blockBufferRef, 0, length, data.mutableBytes);


            SInt16 * samples = (SInt16 *) data.mutableBytes;
            int sampleCount = length / bytesPerSample;
            for (int i = 0; i < sampleCount ; i ++) {

                SInt16 left = *samples++;

                totalLeft  += left;



                SInt16 right;
                if (channelCount==2) {
                    right = *samples++;

                    totalRight += right;
                }

                sampleTally++;

                if (sampleTally > samplesPerPixel) {

                    left  = totalLeft / sampleTally;

                    SInt16 fix = abs(left);
                    if (fix > normalizeMax) {
                        normalizeMax = fix;
                    }


                    [fullSongData appendBytes:&left length:sizeof(left)];

                    if (channelCount==2) {
                        right = totalRight / sampleTally;


                        SInt16 fix = abs(right);
                        if (fix > normalizeMax) {
                            normalizeMax = fix;
                        }


                        [fullSongData appendBytes:&right length:sizeof(right)];
                    }

                    totalLeft   = 0;
                    totalRight  = 0;
                    sampleTally = 0;

                }
            }



            [wader drain];


            CMSampleBufferInvalidate(sampleBufferRef);

            CFRelease(sampleBufferRef);
        }
    }


    NSData * finalData = nil;

    if (reader.status == AVAssetReaderStatusFailed || reader.status == AVAssetReaderStatusUnknown){
        // Something went wrong. return nil

        return nil;
    }

    if (reader.status == AVAssetReaderStatusCompleted){

        NSLog(@"rendering output graphics using normalizeMax %d",normalizeMax);

        UIImage *test = [self audioImageGraph:(SInt16 *)
                         fullSongData.bytes
                                 normalizeMax:normalizeMax
                                  sampleCount:fullSongData.length / 4 // 4 bytes per averaged stereo frame
                                 channelCount:2                       // assumes a stereo track
                                  imageHeight:100];

        finalData = UIImagePNGRepresentation(test);
    }




    [fullSongData release];
    [reader release];

    return finalData;
}
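
For completeness, a hypothetical call site for this method could look like the following (recordedFileURL and waveformImageView are placeholder names; the former would typically be the URL your recorder wrote to):

AVURLAsset *asset = [AVURLAsset URLAssetWithURL:recordedFileURL options:nil];
NSData *pngData = [self renderPNGAudioPictogramForAssett:asset];
if (pngData) {
    self.waveformImageView.image = [UIImage imageWithData:pngData];
}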


If you want real-time graphics derived from mic input, then use the RemoteIO Audio Unit, which is what most native iOS app developers use for low-latency audio, and Metal or OpenGL for drawing waveforms, which will give you the highest frame rates. You will need completely different code from that provided in your question to do so, as AVAsset reading and recording, Core Graphics line drawing and PNG rendering are far, far too slow to use.

Update: with iOS 8 and newer, the Metal API may be able to render graphic visualizations with even greater performance than OpenGL.

Update 2: Here are some code snippets for recording live audio using Audio Units and drawing bitmaps using Metal in Swift 3: https://gist.github.com/hotpaw2/f108a3c785c7287293d7e1e81390c20b
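
For orientation, here is a minimal Objective-C/C sketch of the RemoteIO setup this answer recommends: enable microphone input on the remote I/O unit and pull samples in an input callback. This is only an illustrative sketch assuming 16-bit mono PCM at 44.1 kHz; AVAudioSession configuration, error handling and the actual Metal/OpenGL drawing are omitted.

#import <AudioToolbox/AudioToolbox.h>

static AudioUnit gRemoteIOUnit;   // kept global only to keep the sketch short

// Called by the audio unit whenever new microphone samples are available.
static OSStatus InputCallback(void *inRefCon,
                              AudioUnitRenderActionFlags *ioActionFlags,
                              const AudioTimeStamp *inTimeStamp,
                              UInt32 inBusNumber,
                              UInt32 inNumberFrames,
                              AudioBufferList *ioData)
{
    SInt16 samples[inNumberFrames];             // 16-bit mono buffer for this callback
    AudioBufferList bufferList;
    bufferList.mNumberBuffers = 1;
    bufferList.mBuffers[0].mNumberChannels = 1;
    bufferList.mBuffers[0].mDataByteSize = inNumberFrames * sizeof(SInt16);
    bufferList.mBuffers[0].mData = samples;

    // Pull the captured frames out of the input bus.
    OSStatus status = AudioUnitRender(gRemoteIOUnit, ioActionFlags, inTimeStamp,
                                      inBusNumber, inNumberFrames, &bufferList);
    if (status == noErr) {
        // Hand the samples to the drawing side, e.g. push min/max pairs into a
        // ring buffer that a Metal/OpenGL view reads on the main thread.
    }
    return status;
}

static void SetupRemoteIO(void)
{
    AudioComponentDescription desc = {0};
    desc.componentType = kAudioUnitType_Output;
    desc.componentSubType = kAudioUnitSubType_RemoteIO;
    desc.componentManufacturer = kAudioUnitManufacturer_Apple;

    AudioComponent comp = AudioComponentFindNext(NULL, &desc);
    AudioComponentInstanceNew(comp, &gRemoteIOUnit);

    // Bus 1 is the microphone side: enable input; disable the speaker output (bus 0).
    UInt32 one = 1, zero = 0;
    AudioUnitSetProperty(gRemoteIOUnit, kAudioOutputUnitProperty_EnableIO,
                         kAudioUnitScope_Input, 1, &one, sizeof(one));
    AudioUnitSetProperty(gRemoteIOUnit, kAudioOutputUnitProperty_EnableIO,
                         kAudioUnitScope_Output, 0, &zero, sizeof(zero));

    // Ask for 16-bit mono PCM at 44.1 kHz on the input bus.
    AudioStreamBasicDescription asbd = {0};
    asbd.mSampleRate       = 44100;
    asbd.mFormatID         = kAudioFormatLinearPCM;
    asbd.mFormatFlags      = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
    asbd.mChannelsPerFrame = 1;
    asbd.mBitsPerChannel   = 16;
    asbd.mBytesPerFrame    = 2;
    asbd.mFramesPerPacket  = 1;
    asbd.mBytesPerPacket   = 2;
    AudioUnitSetProperty(gRemoteIOUnit, kAudioUnitProperty_StreamFormat,
                         kAudioUnitScope_Output, 1, &asbd, sizeof(asbd));

    // Register the input callback, then initialize and start the unit.
    AURenderCallbackStruct cb = { InputCallback, NULL };
    AudioUnitSetProperty(gRemoteIOUnit, kAudioOutputUnitProperty_SetInputCallback,
                         kAudioUnitScope_Global, 1, &cb, sizeof(cb));

    AudioUnitInitialize(gRemoteIOUnit);
    AudioOutputUnitStart(gRemoteIOUnit);
}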

You should check out EZAudio (https://github.com/syedhali/EZAudio), specifically the EZRecorder and the EZAudioPlot (or the GPU-accelerated EZAudioPlotGL).

There is also an example project that does exactly what you want: https://github.com/syedhali/EZAudio/tree/master/EZAudioExamples/iOS/EZAudioRecordExample

EDIT: Here's the code inline.

/// In your interface

/**
 Use a OpenGL based plot to visualize the data coming in
 */
@property (nonatomic,weak) IBOutlet EZAudioPlotGL *audioPlot;
/**
 The microphone component
 */
@property (nonatomic,strong) EZMicrophone *microphone;
/**
 The recorder component
 */
@property (nonatomic,strong) EZRecorder *recorder;

...

/// In your implementation

// Create an instance of the microphone and tell it to use this view controller instance as the delegate
-(void)viewDidLoad {
    [super viewDidLoad];
    self.microphone = [EZMicrophone microphoneWithDelegate:self startsImmediately:YES];
}

// EZMicrophoneDelegate will provide these callbacks
-(void)microphone:(EZMicrophone *)microphone
 hasAudioReceived:(float **)buffer
   withBufferSize:(UInt32)bufferSize
withNumberOfChannels:(UInt32)numberOfChannels {
  dispatch_async(dispatch_get_main_queue(),^{
    // Updates the audio plot with the waveform data
    [self.audioPlot updateBuffer:buffer[0] withBufferSize:bufferSize];
  });
}

-(void)microphone:(EZMicrophone *)microphone hasAudioStreamBasicDescription:(AudioStreamBasicDescription)audioStreamBasicDescription {
  // The AudioStreamBasicDescription of the microphone stream. This is useful when configuring the EZRecorder or telling another component what audio format type to expect.

  // We can initialize the recorder with this ASBD
  self.recorder = [EZRecorder recorderWithDestinationURL:[self testFilePathURL]
                                         andSourceFormat:audioStreamBasicDescription];

}

-(void)microphone:(EZMicrophone *)microphone
    hasBufferList:(AudioBufferList *)bufferList
   withBufferSize:(UInt32)bufferSize
withNumberOfChannels:(UInt32)numberOfChannels {

  // Getting audio data as a buffer list that can be directly fed into the EZRecorder. This is happening on the audio thread - any UI updating needs a GCD main queue block. This will keep appending data to the tail of the audio file.
  if( self.isRecording ){
    [self.recorder appendDataFromBufferList:bufferList
                             withBufferSize:bufferSize];
  }

}
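
The snippet above references self.isRecording and [self testFilePathURL], which come from the EZAudio example project and are not shown here. A rough sketch of what they might look like (the file name below is just a placeholder):

// In your interface (placeholder names)
@property (nonatomic, assign) BOOL isRecording;

// In your implementation – a hypothetical destination URL for the recording
- (NSURL *)testFilePathURL
{
    NSString *documents = [NSSearchPathForDirectoriesInDomains(NSDocumentDirectory,
                                                               NSUserDomainMask,
                                                               YES) firstObject];
    return [NSURL fileURLWithPath:[documents stringByAppendingPathComponent:@"test.m4a"]];
}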

I was searching for the same thing (making a waveform from the audio recorder's data). I found some libraries that might be helpful, and it is worth reading their code to understand the logic behind this.

The calculation is all based on sine functions and mathematical formulas. It is quite simple once you take a look at the code!

https://github.com/stefanceriu/SCSiriWaveformView

or

https://github.com/raffael/SISinusWaveView

These are only a few of the examples you can find on the web.
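
As a concrete illustration of how such views are usually driven, here is a rough sketch that feeds AVAudioRecorder metering values into an SCSiriWaveformView. It assumes SCSiriWaveformView's updateWithLevel: API; self.recorder and self.waveformView are placeholder properties.

// Assumed properties (placeholders):
//   @property (nonatomic, strong) AVAudioRecorder *recorder;                  // metering-enabled recorder
//   @property (nonatomic, weak) IBOutlet SCSiriWaveformView *waveformView;

- (void)startMetering
{
    self.recorder.meteringEnabled = YES;
    CADisplayLink *link = [CADisplayLink displayLinkWithTarget:self
                                                      selector:@selector(updateWaveform:)];
    [link addToRunLoop:[NSRunLoop mainRunLoop] forMode:NSRunLoopCommonModes];
}

- (void)updateWaveform:(CADisplayLink *)link
{
    [self.recorder updateMeters];
    // averagePowerForChannel: reports decibels (<= 0); map that to a 0..1 level.
    float power = [self.recorder averagePowerForChannel:0];
    CGFloat level = powf(10.0f, power / 20.0f);
    [self.waveformView updateWithLevel:level];
}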
