How can I extract audio features using the MFCC algorithm and use them with a Convolutional Neural Network to train a model?
I have extracted audio features using MFCC, and the output file contains columns of floating-point values, but I am unable to tell what the columns mean.
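import os
import numpy
import scipy.io.wavfile as wav
from python_speech_features import mfcc, logfbank  # assumed source of mfcc() and logfbank()

for filename in os.listdir(directoryName):
    if filename.endswith('.wav'):  # only process .wav files
        rate, sig = wav.read(os.path.join(directoryName, filename))
        mfcc_feat = mfcc(sig, rate)        # MFCCs, one row per frame
        fbank_feat = logfbank(sig, rate)   # log mel filterbank energies, one row per frame
        outputFile = os.path.join(resultsDirectory, os.path.splitext(filename)[0] + ".csv")
        # Note: this writes fbank_feat; pass mfcc_feat instead to save the MFCCs
        with open(outputFile, 'w') as file:
            numpy.savetxt(file, fbank_feat, delimiter=",")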
The values in the CSV file look like this:
7.01E+00 5.94E+00 5.28E+00 5.25E+00 5.24E+00
5.87E+00 3.53E+00 3.61E+00 2.32E+00 2.13E+00
5.68E+00 8.36E-01 1.75E-01 -8.48E-01 1.77E+00
7.96E+00 6.12E+00 5.47E+00 4.66E+00 4.34E+00
6.29E+00 4.34E+00 3.51E+00 3.15E+00 2.30E+00
6.37E+00 5.34E+00 4.76E+00 3.98E+00 3.77E+00
4.72E+00 1.62E+00 3.09E+00 1.66E+00 1.37E+00
6.14E+00 5.82E+00 5.12E+00 4.11E+00 3.76E+00
7.49E+00 3.79E+00 2.25E+00 5.03E+00 5.69E+00
5.89E+00 4.88E+00 5.88E+00 6.22E+00 6.19E+00
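The MFCC features of an audio signal form a time series: each row is one time frame and each column is one feature value (coefficient) for that frame. If your input audio is 10 seconds long at 44100 Hz and the MFCC uses a hop size of 1024 samples (approx. 23 ms), you will get about 430 frames, each with a fixed number of MFCC coefficients (20, for example).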
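The frame count above can be checked with a few lines of arithmetic. This is only a rough sketch: the helper name frame_count is made up for illustration, and it assumes no padding at the signal edges, so the exact number a given MFCC library returns may differ by a frame or two.

```python
def frame_count(duration_s, sample_rate, hop_length):
    """Number of whole hops that fit in the signal (assumes no edge padding)."""
    n_samples = int(duration_s * sample_rate)
    return n_samples // hop_length

sample_rate = 44100   # Hz
hop_length = 1024     # samples between the starts of successive frames

n_frames = frame_count(10.0, sample_rate, hop_length)
hop_ms = 1000.0 * hop_length / sample_rate   # hop size in milliseconds

print(n_frames)           # 430
print(round(hop_ms, 1))   # 23.2
```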
To classify this with a Convolutional Neural Network, you need to split the sequence into fixed-size analysis windows of a practical length. For example, a window of 43 MFCC frames corresponds to approximately 1 second, and the input to the CNN then has shape 43x20x1. If you want overlapping analysis windows (which can improve performance at the cost of more compute time), advance by fewer than 43 frames when computing the next window.
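The windowing step could look roughly like the sketch below. It assumes the features are already in a NumPy array of shape (n_frames, n_coeffs); the function name split_into_windows and the random placeholder data are illustrative only, not part of any library.

```python
import numpy as np

def split_into_windows(features, window_frames=43, step_frames=43):
    """Cut a (n_frames, n_coeffs) array into fixed-size windows.

    Returns an array of shape (n_windows, window_frames, n_coeffs, 1).
    A step smaller than window_frames gives overlapping windows.
    Trailing frames that do not fill a whole window are dropped.
    """
    n_frames, n_coeffs = features.shape
    starts = range(0, n_frames - window_frames + 1, step_frames)
    windows = [features[s:s + window_frames] for s in starts]
    if not windows:  # input shorter than one window
        return np.empty((0, window_frames, n_coeffs, 1), dtype=features.dtype)
    # Stack the windows and add a trailing channel axis for the CNN
    return np.stack(windows)[..., np.newaxis]

# Placeholder standing in for 430 frames of 20 MFCC coefficients
mfcc_frames = np.random.rand(430, 20)

non_overlapping = split_into_windows(mfcc_frames)
overlapping = split_into_windows(mfcc_frames, step_frames=21)  # roughly 50% overlap

print(non_overlapping.shape)  # (10, 43, 20, 1)
print(overlapping.shape)      # (19, 43, 20, 1)
```

Each window then serves as one training example, usually labelled with the class of the recording it was cut from.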
Here is an answer with example Python code. It is shown for a mel-spectrogram, but can be adapted to MFCC by replacing the call to librosa.feature.melspectrogram() with librosa.feature.mfcc().