
Remove initial silence from recorded audio file of wave type

Can anyone help me remove the initial silence from a recorded audio file?

I read the data bytes of the WAV file, skip the first 44 header bytes, and then find the end of the leading run of zero bytes, which I treat as the silent part of the file.

From the total number of data bytes, the end of that silent range, and the total duration of the file, I calculate how long the silence lasts and trim that much time from the start of the audio file.

The problem is that some silence still remains in the audio file.

I am not sure what I missed.

- (double)processAudio:(float)totalFileDuration withFilePathURL:(NSURL *)filePathURL{
    NSMutableData *data = [NSMutableData dataWithContentsOfURL:filePathURL];
    NSMutableData *Wave1= [NSMutableData dataWithData:[data subdataWithRange:NSMakeRange(44, [data length] - 44)]];
    uint8_t * bytePtr = (uint8_t  * )[Wave1 bytes] ;
    NSInteger totalData = [Wave1 length] / sizeof(uint8_t);
    int endRange = 0;
    for (int i = 0 ; i < totalData; i ++){
        if (bytePtr[i] == 0) {
            endRange = i;
        }else
            break;
    }

    double silentAudioDuration =(((float)endRange/(float)totalData)*totalFileDuration);
    return silentAudioDuration;
}
- (void)trimAudioFileWithInputFilePath :(NSString *)inputPath toOutputFilePath : (NSString *)outputPath{
    NSString *strInputFilePath = inputPath;
    NSURL *audioFileInput = [NSURL fileURLWithPath:strInputFilePath];

    NSString *strOutputFilePath = [outputPath stringByDeletingPathExtension];
    strOutputFilePath = [strOutputFilePath stringByAppendingString:@".m4a"];
    NSURL *audioFileOutput = [NSURL fileURLWithPath:strOutputFilePath];
    newPath = strOutputFilePath;

    if (!audioFileInput || !audioFileOutput){
    }

    [[NSFileManager defaultManager] removeItemAtURL:audioFileOutput error:NULL];
    AVAsset *asset = [AVAsset assetWithURL:audioFileInput];
    CMTime audioDuration = asset.duration;
    float audioDurationSeconds = CMTimeGetSeconds(audioDuration);

    AVAssetExportSession *exportSession = [AVAssetExportSession exportSessionWithAsset:asset presetName:AVAssetExportPresetAppleM4A];

    if (exportSession == nil){
    }

    float startTrimTime = [self processAudio:audioDurationSeconds withFilePathURL:audioFileInput];
    float endTrimTime = audioDurationSeconds;

    recordingDuration = audioDurationSeconds - startTrimTime;

    CMTime startTime = CMTimeMake((int)(floor(startTrimTime * 100)), 100);
    CMTime stopTime = CMTimeMake((int)(ceil(endTrimTime * 100)), 100);
    CMTimeRange exportTimeRange = CMTimeRangeFromTimeToTime(startTime, stopTime);

    exportSession.outputURL = audioFileOutput;
    exportSession.outputFileType = AVFileTypeAppleM4A;
    exportSession.timeRange = exportTimeRange;

    [exportSession exportAsynchronouslyWithCompletionHandler:^{
         if (AVAssetExportSessionStatusCompleted == exportSession.status){
         }
         else if (AVAssetExportSessionStatusFailed == exportSession.status){
         }
     }];
}

What am I doing wrong here?

Is it possible that your files do not contain complete silence? Your samples may have values of 1, 2, or 3, which are technically not silent but are very quiet.
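To illustrate the point about near-zero samples, here is a minimal C sketch that treats a sample as silent when its absolute value is at or below a threshold instead of requiring an exact zero. It assumes 16-bit signed mono PCM that has already been loaded into memory; the function name and the `threshold` parameter are made up for this example, and a sensible threshold depends on the recording.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Count how many samples at the start of a 16-bit mono buffer are "quiet",
   i.e. have an absolute value at or below `threshold` (hypothetical name). */
size_t count_leading_quiet_samples(const int16_t *samples, size_t count, int threshold)
{
    assert(samples != NULL || count == 0);

    size_t i = 0;
    while (i < count && abs((int)samples[i]) <= threshold) {
        i++;
    }
    return i;   /* index of the first sample louder than the threshold */
}
```

With a threshold of 0 this behaves like an exact-zero test; raising it to a small value lets low-level noise count as silence too.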

Wave files store samples as signed numbers when they are 16 bits and as unsigned numbers when they are 8 bits. You are casting your data to unsigned bytes: uint8_t * bytePtr = (uint8_t * )[Wave1 bytes] ;

You need to know the format of your wave file which can be obtained from the header. (It might use sample sizes of say 8 bit, 16 bit, 24 bit, etc.)

If it is 16 bits and mono, you need to use:

int16_t * ptr = (int16_t *) [Wave1 bytes];

Your loop advances one byte at a time, so you would need to adjust it to advance by your frame size.

You also don't consider mono/stereo.
In general, your processAudio function needs more detail: it should take into account the number of channels per frame (stereo/mono) and the sample size.
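As a rough sketch of what a frame-aware loop could look like, the C code below walks the sample data one frame at a time and checks every channel in the frame. It only handles 8-bit (unsigned, silence at 128) and 16-bit (signed, little-endian) PCM; the function names and the `threshold` parameter are assumptions for illustration, not part of any Apple API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Decode one sample to a signed amplitude centred on zero.
   8-bit PCM is unsigned with silence at 128; 16-bit PCM is signed little-endian. */
static int sample_amplitude(const uint8_t *p, int bits_per_sample)
{
    if (bits_per_sample == 8) {
        return (int)p[0] - 128;
    }
    /* 16-bit: assemble the two little-endian bytes, then reinterpret as signed. */
    return (int)(int16_t)(uint16_t)(p[0] | (p[1] << 8));
}

/* Count the leading frames in which every channel stays within `threshold`.
   Only 8- and 16-bit PCM are handled in this sketch. */
size_t count_leading_silent_frames(const uint8_t *data, size_t data_len,
                                   int channels, int bits_per_sample, int threshold)
{
    assert(channels > 0 && (bits_per_sample == 8 || bits_per_sample == 16));

    size_t bytes_per_sample = (size_t)bits_per_sample / 8;
    size_t frame_size = bytes_per_sample * (size_t)channels;
    size_t frames = data_len / frame_size;

    for (size_t f = 0; f < frames; f++) {
        const uint8_t *frame = data + f * frame_size;
        for (int c = 0; c < channels; c++) {
            int a = sample_amplitude(frame + (size_t)c * bytes_per_sample, bits_per_sample);
            if (abs(a) > threshold) {
                return f;   /* first frame with an audible sample */
            }
        }
    }
    return frames;          /* the whole buffer is silent */
}
```

Note that for 8-bit audio a byte value of 0 is actually the most negative amplitude, so testing for zero bytes only makes sense for signed sample formats.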

Here is a wave header with iOS types. You can cast the first 44 bytes and get the header data so you know what you are dealing with.

typedef struct waveHeader_t
{
    // RIFF
    char        chunkID[4];             ///< should always contain "RIFF" (big endian)      //4
    uint32_t    chunkSize;              ///< total file length minus 8 (little endian!)     //4
    char        format[4];              ///< should be "WAVE" (big endian)                  //4

    // fmt
    char        subChunk1ID[4];         ///< "fmt " (big endian)                            //4
    uint32_t    subChunk1Size;          ///< 16 for PCM format                              //4
    uint16_t    audioFormat;            ///< 1 for PCM format                               //2
    uint16_t    numChannels;            ///< channels                                       //2
    uint32_t    sampleRate;             ///< sampling frequency                             //4
    uint32_t    byteRate;               ///< sampleRate * numChannels * bitsPerSample / 8   //4
    uint16_t    blockAlign;             ///< frame size in bytes                            //2
    uint16_t    bitsPerSample;          ///< bits per sample                                //2

    char        subChunk2ID[4];         ///< should always contain "data"                   //4
    uint32_t    subChunk2Size;          ///< number of bytes of sample data                 //4

    ///< sample data follows this.....
} waveHeader_t;
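As an alternative to casting the raw bytes onto the struct, the fields can be decoded explicitly from their byte offsets, which avoids relying on struct packing or host byte order. The sketch below assumes the canonical 44-byte layout shown above; `wave_format_t`, `read_le16`, `read_le32`, and `parse_canonical_wave_header` are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The subset of header fields needed for silence detection (hypothetical type). */
typedef struct {
    uint16_t audio_format;
    uint16_t num_channels;
    uint32_t sample_rate;
    uint16_t block_align;
    uint16_t bits_per_sample;
} wave_format_t;

static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 on success, 0 if the buffer does not start with a canonical
   RIFF/WAVE header whose first sub-chunk is "fmt ". */
int parse_canonical_wave_header(const uint8_t *bytes, size_t len, wave_format_t *out)
{
    assert(out != NULL);

    if (len < 44) return 0;
    if (memcmp(bytes, "RIFF", 4) != 0) return 0;
    if (memcmp(bytes + 8, "WAVE", 4) != 0) return 0;
    if (memcmp(bytes + 12, "fmt ", 4) != 0) return 0;

    out->audio_format    = read_le16(bytes + 20);
    out->num_channels    = read_le16(bytes + 22);
    out->sample_rate     = read_le32(bytes + 24);
    out->block_align     = read_le16(bytes + 32);
    out->bits_per_sample = read_le16(bytes + 34);
    return 1;
}
```

In the Objective-C code, the `bytes` and `length` of the NSData object could be passed in directly.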

So your to-do list is:

  • Extract the fields from the header.
  • Specifically, get the number of channels and the bits per sample (note: BITS, not bytes).
  • Point to the data with a pointer of the appropriate size and loop through one frame at a time. (A mono frame has one sample, which could be 8, 16, 24, etc. bits. A stereo frame has two samples, each of which could be 8, 16, or 24 bits. For example, LR LR LR LR LR LR would be 6 frames.)
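Once the header fields are known, the length of the leading silence can be computed directly from the sample rate rather than as a fraction of the total file duration. A minimal C sketch, assuming the number of leading silent bytes has already been found; the function name is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert a count of leading silent bytes in the sample data into seconds.
   `block_align` is the frame size in bytes and `sample_rate` is frames per
   second, both taken from the wave header. */
double silent_duration_seconds(size_t silent_bytes, uint16_t block_align, uint32_t sample_rate)
{
    assert(block_align > 0 && sample_rate > 0);

    size_t silent_frames = silent_bytes / block_align;   /* whole frames only */
    return (double)silent_frames / (double)sample_rate;
}
```

Since CMTimeMake takes an integer value and a timescale, the silent frame count and the sample rate could probably be passed to it as they are, which would avoid rounding the start time to hundredths of a second.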

The header of an Apple-generated wave file is usually not 44 bytes long. Some Apple-generated headers are 4 KB long. You have to inspect the wave RIFF header for an extra 'FLLR' filler chunk. If you don't skip past this filler padding, you will end up with about an extra tenth of a second of silence (or potentially even bad data).
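One way to cope with headers of varying length is to walk the RIFF chunks and look for the "data" chunk instead of assuming a fixed offset of 44. The C sketch below does that under the assumption that the file is a well-formed RIFF/WAVE file; `find_wave_data_offset` and `read_le32` are hypothetical helper names, and any chunk that is not "data" (including filler) is simply skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the byte offset of the first sample in the "data" chunk and stores
   the chunk's payload size in *data_size. Returns 0 if no "data" chunk is found. */
size_t find_wave_data_offset(const uint8_t *bytes, size_t len, uint32_t *data_size)
{
    assert(data_size != NULL);

    if (len < 12 || memcmp(bytes, "RIFF", 4) != 0 || memcmp(bytes + 8, "WAVE", 4) != 0) {
        return 0;
    }

    size_t pos = 12;                            /* first sub-chunk follows "WAVE" */
    while (pos + 8 <= len) {
        uint32_t size = read_le32(bytes + pos + 4);
        if (memcmp(bytes + pos, "data", 4) == 0) {
            *data_size = size;
            return pos + 8;                     /* samples start after the 8-byte chunk header */
        }
        /* Skip any other chunk ("fmt ", filler, "LIST", ...). */
        size_t remaining = len - pos - 8;
        if (size > remaining) {
            break;                              /* truncated or corrupt file */
        }
        pos += 8 + (size_t)size + (size & 1u);  /* chunks are padded to an even size */
    }
    return 0;
}
```

For a canonical file with only "fmt " and "data" chunks this returns 44, so it is a drop-in replacement for the hard-coded offset.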
