
Issues with scipy.io fft and ifft

I'm trying to apply machine learning algorithms to raw audio. My training data would be the Fourier coefficients of the audio signal. I tried to compute those and then apply ifft to get my audio back, but it doesn't work with my implementation, which is:

import numpy as np
from scipy.io import wavfile
from scipy.fft import fft, ifft

fs, data = wavfile.read('dataset piano/wav/music (1).wav')
Te = 0.25  # duration of one chunk, in seconds
T = 40     # number of chunks (40 * 0.25 s = 10 s)

a = data.T[0]  # retrieve the first channel
# Put the information in a matrix: each row contains the Fourier coefficients of 0.25 s of music.
# The whole matrix, which has 40 rows, contains the information for 10 s of the wav file.
X = np.array([fft(a[int(i*fs*Te):int((i+1)*fs*Te)]) for i in range(T)])
Z = ifft(X.flatten())
Z = Z.astype(data.dtype)

wavfile.write('test3.wav', fs, Z)

This should play the first 10 s of the wav file, but it doesn't, and I really don't understand why. All I get is a high-pitched sound. I am using the fft and ifft from scipy.

You were very close. Just change

Z = ifft(X.flatten())

to

Z = ifft(X).flatten()

What you are doing is computing a single inverse Fourier transform on a concatenation of spectra, which makes no sense. What you want instead is to concatenate the inverse Fourier transforms of the individual spectra. This is what I did, and I managed to reconstruct a signal that sounds right.

ifft(X) runs an IFFT on every array along the last dimension, which is the spectrum dimension in your case, and returns an array of the same shape, (40, 11025). flatten then concatenates the rows, producing a sensible signal.
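To see the difference without needing the original wav file, here is a minimal sketch on a synthetic signal. The sample rate, the number of chunks, and the 440 Hz test tone are all made up for illustration, and it uses numpy.fft in place of the scipy functions (both transform along the last axis by default). It also takes the real part and rounds before casting back to integer samples, which is an extra step not in the code above.

```python
import numpy as np

fs = 8000          # hypothetical sample rate in Hz (not from the question)
Te = 0.25          # chunk duration in seconds, as in the question
T = 4              # number of chunks (the question uses 40)
n = int(fs * Te)   # samples per chunk

# Synthetic stand-in for one audio channel: a 440 Hz tone as 16-bit samples
t = np.arange(T * n) / fs
a = (10000 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)

# One spectrum per row, shape (T, n)
X = np.array([np.fft.fft(a[i * n:(i + 1) * n]) for i in range(T)])

# Row-wise inverse transform, then concatenate the chunks
Z_good = np.fft.ifft(X).flatten()

# One inverse transform over the concatenated spectra (the original mistake)
Z_bad = np.fft.ifft(X.flatten())

# The result is complex with a negligible imaginary part: keep the real
# part and round before casting back to the integer sample type
restored = np.rint(Z_good.real).astype(a.dtype)

print(np.array_equal(restored, a))   # row-wise version recovers the samples
print(np.allclose(Z_bad.real, a))    # flattened version does not
```

If this matches what you want, the same two changes (ifft(X).flatten() and taking the real part) should carry over to the code that reads the actual file.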
