
Streaming Audio Through WebSocket recorded with AudioQueue in iOS

I am building a transcription app for iOS. I need to record audio into buffers and stream them to a server over a socket, so I am using AudioQueue to capture the audio into buffers.

The audio is recorded properly to a local file. For streaming, I wrap the audio data in NSData and send it through the socket. However, the audio quality on the server is poor; the voice is not clear at all and is buried in noise. The same logic works correctly on Android, so the server-side code is fine; the problem is in the iOS streaming path. I have tried two different WebSocket libraries (SocketRocket and PocketSocket), and the problem is the same with both.

I have attached my code below. Please let me know if you can help.

ViewController.h

#import <UIKit/UIKit.h>
#import <AudioToolbox/AudioQueue.h>
#import <AudioToolbox/AudioFile.h>
#import <SocketRocket/SocketRocket.h>

#define NUM_BUFFERS 3
#define SAMPLERATE 16000

//Struct defining recording state
typedef struct {
    AudioStreamBasicDescription dataFormat;
    AudioQueueRef               queue;
    AudioQueueBufferRef         buffers[NUM_BUFFERS];
    AudioFileID                 audioFile;
    SInt64                      currentPacket;
    bool                        recording;
} RecordState;


//Struct defining playback state
typedef struct {
    AudioStreamBasicDescription dataFormat;
    AudioQueueRef               queue;
    AudioQueueBufferRef         buffers[NUM_BUFFERS];
    AudioFileID                 audioFile;
    SInt64                      currentPacket;
    bool                        playing;
} PlayState;

@interface ViewController : UIViewController <SRWebSocketDelegate> {
    RecordState recordState;
    PlayState playState;
    CFURLRef fileURL;
}

@property (nonatomic, strong) SRWebSocket * webSocket;

@property (weak, nonatomic) IBOutlet UITextView *textView;

@end

ViewController.m

#import "ViewController.h"


id thisClass;

//Declare C callback functions
void AudioInputCallback(void * inUserData,  // Custom audio metadata
                        AudioQueueRef inAQ,
                        AudioQueueBufferRef inBuffer,
                        const AudioTimeStamp * inStartTime,
                        UInt32 inNumberPacketDescriptions,
                        const AudioStreamPacketDescription * inPacketDescs);

void AudioOutputCallback(void * inUserData,
                         AudioQueueRef outAQ,
                         AudioQueueBufferRef outBuffer);


@interface ViewController ()



@end

@implementation ViewController 

@synthesize webSocket;
@synthesize textView;


// Takes a filled buffer and writes it to disk, "emptying" the buffer
void AudioInputCallback(void * inUserData,
                        AudioQueueRef inAQ, 
                        AudioQueueBufferRef inBuffer,
                        const AudioTimeStamp * inStartTime,
                        UInt32 inNumberPacketDescriptions,
                        const AudioStreamPacketDescription * inPacketDescs)
{
    RecordState * recordState = (RecordState*)inUserData;
    if (!recordState->recording)
    {
        printf("Not recording, returning\n");
        return; // bail out so stale data is neither written nor sent
    }


    printf("Writing buffer %lld\n", recordState->currentPacket);
    OSStatus status = AudioFileWritePackets(recordState->audioFile,
                                            false,
                                            inBuffer->mAudioDataByteSize,
                                            inPacketDescs,
                                            recordState->currentPacket,
                                            &inNumberPacketDescriptions,
                                            inBuffer->mAudioData);



    if (status == 0)
    {
        recordState->currentPacket += inNumberPacketDescriptions;

        // Send only the bytes in this buffer; multiplying by NUM_BUFFERS reads past the end of the buffer
        NSData * audioData = [NSData dataWithBytes:inBuffer->mAudioData length:inBuffer->mAudioDataByteSize];
        [thisClass sendAudioToSocketAsData:audioData];

    }

    AudioQueueEnqueueBuffer(recordState->queue, inBuffer, 0, NULL);
}

// Fills an empty buffer with data and sends it to the speaker
void AudioOutputCallback(void * inUserData,
                         AudioQueueRef outAQ,
                         AudioQueueBufferRef outBuffer) {
    PlayState * playState = (PlayState *) inUserData;
    if(!playState -> playing) {
        printf("Not playing, returning\n");
        return;
    }

    printf("Queuing buffer %lld for playback\n", playState -> currentPacket);

    UInt32 bytesRead;
    // At most (buffer size / bytesPerPacket) packets fit in the SAMPLERATE-byte buffer
    UInt32 numPackets = SAMPLERATE / 2;
    OSStatus status;
    // Linear PCM has constant-size packets, so pass NULL for the packet
    // descriptions; the original uninitialized pointer was undefined behavior.
    status = AudioFileReadPackets(playState -> audioFile, false, &bytesRead, NULL, playState -> currentPacket, &numPackets, outBuffer -> mAudioData);

    if (numPackets) {
        outBuffer -> mAudioDataByteSize = bytesRead;
        status = AudioQueueEnqueueBuffer(playState -> queue, outBuffer, 0, NULL);
        playState -> currentPacket += numPackets;
    }else {
        if (playState -> playing) {
            AudioQueueStop(playState -> queue, false);
            AudioFileClose(playState -> audioFile);
            playState -> playing = false;
        }

        AudioQueueFreeBuffer(playState -> queue, outBuffer);
    }

}

- (void) setupAudioFormat:(AudioStreamBasicDescription *) format {


    format -> mSampleRate = SAMPLERATE;
    format -> mFormatID = kAudioFormatLinearPCM;
    format -> mFramesPerPacket = 1;
    format -> mChannelsPerFrame = 1;
    format -> mBytesPerFrame = 2;
    format -> mBytesPerPacket = 2;
    format -> mBitsPerChannel = 16;
    format -> mReserved = 0;
    format -> mFormatFlags =  kLinearPCMFormatFlagIsBigEndian |kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;

}


- (void)viewDidLoad {
    [super viewDidLoad];
    // Do any additional setup after loading the view, typically from a nib.

     char path[256];
    [self getFilename:path maxLength:sizeof path];

    fileURL = CFURLCreateFromFileSystemRepresentation(NULL, (UInt8*)path, strlen(path), false);


    // Init state variables
    recordState.recording = false;
    thisClass = self;

}

- (void) startRecordingInQueue {
    [self setupAudioFormat:&recordState.dataFormat];

    recordState.currentPacket = 0;

    OSStatus status;

    status = AudioQueueNewInput(&recordState.dataFormat, AudioInputCallback, &recordState, CFRunLoopGetCurrent(), kCFRunLoopCommonModes, 0, &recordState.queue);
    if(status == 0) {
        //Prime recording buffers with empty data
        for (int i=0; i < NUM_BUFFERS; i++) {
            // SAMPLERATE (16000) is used here as a byte count: 0.5 s of 16-bit mono audio per buffer
            AudioQueueAllocateBuffer(recordState.queue, SAMPLERATE, &recordState.buffers[i]);
            AudioQueueEnqueueBuffer(recordState.queue, recordState.buffers[i], 0, NULL);
        }

        status = AudioFileCreateWithURL(fileURL, kAudioFileAIFFType, &recordState.dataFormat, kAudioFileFlags_EraseFile, &recordState.audioFile);
        if (status == 0) {
            recordState.recording = true;
            status = AudioQueueStart(recordState.queue, NULL);
            if(status == 0) {
                NSLog(@"-----------Recording--------------");
                NSLog(@"File URL : %@", fileURL);
            }
        }
    }

    if (status != 0) {
        [self stopRecordingInQueue];
    }
}

- (void) stopRecordingInQueue {
    recordState.recording = false;
    AudioQueueStop(recordState.queue, true);
    for (int i=0; i < NUM_BUFFERS; i++) {
        AudioQueueFreeBuffer(recordState.queue, recordState.buffers[i]);
    }

    AudioQueueDispose(recordState.queue, true);
    AudioFileClose(recordState.audioFile);
    NSLog(@"---Idle------");
    NSLog(@"File URL : %@", fileURL);


}

- (void) startPlaybackInQueue {
    playState.currentPacket = 0;
    [self setupAudioFormat:&playState.dataFormat];

    OSStatus status;
    status = AudioFileOpenURL(fileURL, kAudioFileReadPermission, kAudioFileAIFFType, &playState.audioFile);
    if (status == 0) {
        status = AudioQueueNewOutput(&playState.dataFormat, AudioOutputCallback, &playState, CFRunLoopGetCurrent(), kCFRunLoopCommonModes, 0, &playState.queue);
        if( status == 0) {
            //Allocate and prime playback buffers
            playState.playing = true;
            for (int i=0; i < NUM_BUFFERS && playState.playing; i++) {
                AudioQueueAllocateBuffer(playState.queue, SAMPLERATE, &playState.buffers[i]);
                AudioOutputCallback(&playState, playState.queue, playState.buffers[i]);
            }

            status = AudioQueueStart(playState.queue, NULL);
            if (status == 0) {
                NSLog(@"-------Playing Audio---------");
            }
        }
    }

    if (status != 0) {
        [self stopPlaybackInQueue];
        NSLog(@"---Playing Audio Failed ------");
    }
}

- (void) stopPlaybackInQueue {
    playState.playing = false;

    for (int i=0; i < NUM_BUFFERS; i++) {
        AudioQueueFreeBuffer(playState.queue, playState.buffers[i]);
    }

    AudioQueueDispose(playState.queue, true);
    AudioFileClose(playState.audioFile);
}

- (IBAction)startRecordingAudio:(id)sender {
    NSLog(@"starting recording tapped");
    [self startRecordingInQueue];
}
- (IBAction)stopRecordingAudio:(id)sender {
    NSLog(@"stop recording tapped");
    [self stopRecordingInQueue];
}


- (IBAction)startPlayingAudio:(id)sender {
    NSLog(@"start playing audio tapped");
    [self startPlaybackInQueue];
}

- (IBAction)stopPlayingAudio:(id)sender {
    NSLog(@"stop playing audio tapped");
    [self stopPlaybackInQueue];
}

- (BOOL) getFilename:(char *) buffer maxLength:(int) maxBufferLength {

    NSArray * paths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
    NSString * docDir = [paths objectAtIndex:0];

    NSString * file = [docDir stringByAppendingPathComponent:@"recording.aif"];
    return [file getCString:buffer maxLength:maxBufferLength encoding:NSUTF8StringEncoding];

}


- (void) sendAudioToSocketAsData:(NSData *) audioData {
    [self.webSocket send:audioData];
}

- (IBAction)connectToSocketTapped:(id)sender {
    [self startStreaming];
}

- (void) startStreaming {
    [self connectToSocket];
}

- (void) connectToSocket {
    //Socket Connection Intiliazation

    // create the NSURLRequest that will be sent as the handshake
    NSURLRequest *request = [NSURLRequest requestWithURL:[NSURL URLWithString:@"${url}"]];

    // create the socket and assign delegate

    self.webSocket = [[SRWebSocket alloc] initWithURLRequest:request];

    self.webSocket.delegate = self;

    // open socket
    [self.webSocket open];

}


///--------------------------------------
#pragma mark - SRWebSocketDelegate
///--------------------------------------

- (void)webSocketDidOpen:(SRWebSocket *)webSocket;
{
    NSLog(@"Websocket Connected");

}

- (void) webSocket:(SRWebSocket *)webSocket didFailWithError:(NSError *)error {
    NSLog(@":( Websocket Failed With Error %@", error);
    self.webSocket = nil;
}

- (void) webSocket:(SRWebSocket *)webSocket didReceiveMessage:(id)message {
    NSLog(@"Received \"%@\"", message);

    textView.text = message;    
}

- (void)webSocket:(SRWebSocket *)webSocket didCloseWithCode:(NSInteger)code reason:(NSString *)reason wasClean:(BOOL)wasClean;
{
    NSLog(@"WebSocket closed");
    self.webSocket = nil;
}

- (void)webSocket:(SRWebSocket *)webSocket didReceivePong:(NSData *)pongPayload;
{
    NSLog(@"WebSocket received pong");
}

- (void)didReceiveMemoryWarning {
    [super didReceiveMemoryWarning];
    // Dispose of any resources that can be recreated.
}

@end

Thanks in advance.

I made it work. The problem was the audio format setup. I configured the format correctly after checking the server-side documentation: the big-endian flag was the culprit. If you specify kLinearPCMFormatFlagIsBigEndian, the samples are big-endian; if you omit it, they are little-endian. My server needed little-endian.

- (void) setupAudioFormat:(AudioStreamBasicDescription *) format {

  format -> mSampleRate = 16000.0;
  format -> mFormatID = kAudioFormatLinearPCM;
  format -> mFramesPerPacket = 1;
  format -> mChannelsPerFrame = 1;
  format -> mBytesPerFrame = 2;
  format -> mBytesPerPacket = 2;
  format -> mBitsPerChannel = 16;
  // No kLinearPCMFormatFlagIsBigEndian here, so the samples stay little-endian
  format -> mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;

}
