
Creating an amplitude vs frequency spectrogram of an audio file in Python

I am trying to create an amplitude vs frequency spectrogram of an audio file in Python. What is the procedure to do so? Some sample code would be of great help.

Simple spectrum

The simplest way to get an amplitude vs. frequency relationship for an evenly sampled signal x is to compute its Discrete Fourier Transform through the efficient Fast Fourier Transform algorithm. Given a signal x sampled at a regular sampling rate fs, you could do this with:

import numpy as np
Xf_mag = np.abs(np.fft.fft(x))

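Each index of the Xf_mag array will then contain the amplitude of a frequency bin. For the first half of the array the bin frequency is given by index * fs/len(Xf_mag); the second half holds the negative frequencies, which for a real-valued signal mirror the positive ones. These frequencies can be conveniently computed using: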

freqs = np.fft.fftfreq(len(Xf_mag), d=1.0/fs)
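To make the bin ordering concrete, here is a tiny sketch with made-up values (fs = 8 Hz and 8 samples are assumptions chosen only so the numbers are easy to read). It shows that np.fft.fftfreq lists the non-negative frequencies first and the negative ones after, and that np.fft.fftshift can reorder them for plotting:

```python
import numpy as np

fs = 8.0   # hypothetical sampling rate in Hz, chosen for readability
n = 8      # hypothetical number of samples

# Bin frequencies in the same order as the output of np.fft.fft
freqs = np.fft.fftfreq(n, d=1.0 / fs)
# values: 0, 1, 2, 3, -4, -3, -2, -1

# Reorder so the frequencies increase monotonically (handy before plotting);
# the same shift would have to be applied to the magnitude array
freqs_sorted = np.fft.fftshift(freqs)
# values: -4, -3, -2, -1, 0, 1, 2, 3
```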

Finally the spectrum could be plotted using matplotlib:

import matplotlib.pyplot as plt
plt.plot(freqs, Xf_mag)
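For a real-valued signal such as audio, the negative-frequency half is redundant, so a one-sided spectrum is often easier to read. The following is a minimal sketch, not part of the original answer: it assumes a synthetic 440 Hz tone in place of real audio, uses np.fft.rfft and np.fft.rfftfreq, and rescales the magnitudes so that a sinusoid of amplitude A appears as a peak of height A.

```python
import numpy as np

fs = 8000                                # sampling rate in Hz (assumed for this sketch)
t = np.arange(fs) / fs                   # one second of sample times
x = 0.5 * np.sin(2 * np.pi * 440 * t)    # synthetic 440 Hz tone with amplitude 0.5

# rfft keeps only the non-negative frequencies of a real signal
Xf = np.fft.rfft(x)
freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)

# Double the magnitudes to account for the discarded negative half,
# and divide by the number of samples to undo the FFT's implicit scaling
amplitude = np.abs(Xf) * 2.0 / len(x)
amplitude[0] /= 2.0                      # the DC bin has no mirrored counterpart
if len(x) % 2 == 0:
    amplitude[-1] /= 2.0                 # neither does the Nyquist bin (even length only)

peak_freq = freqs[np.argmax(amplitude)]  # frequency of the strongest bin
peak_amp = amplitude.max()               # its amplitude
```

The result can be plotted in the same way as before, with plt.plot(freqs, amplitude). The peak height only matches the true amplitude exactly when the tone falls on a bin center, as it does here; otherwise spectral leakage spreads it over neighbouring bins.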

Refining the spectrum estimation

You might notice that the spectrum obtained with the simple FFT approach appears very noisy (i.e. with lots of spikes). To get a more accurate estimate, a more sophisticated approach would be to compute a power spectrum estimate using techniques such as periodograms (implemented by scipy.signal.periodogram) and Welch's method (implemented by scipy.signal.welch). Note however that in these cases the computed spectrum is proportional to the square of the amplitudes, so that its square root provides an estimate of the Root-Mean-Squared (RMS) amplitudes.

Going back to the signal x sampled at a regular sampling rate fs, such a power spectrum estimate could thus be obtained as described in the samples from scipy's documentation with the following:

from scipy import signal
f, Pxx = signal.periodogram(x, fs)
A_rms = np.sqrt(Pxx)

The corresponding frequencies f are also calculated in the process, so you could then plot the result with

plt.plot(f, A_rms)

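Using scipy.signal.welch is quite similar, but it splits the signal into overlapping segments and averages their periodograms. This reduces the variance of the estimate at the cost of coarser frequency resolution, which gives a different accuracy/resolution tradeoff.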
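As a rough illustration of that tradeoff, the sketch below runs both estimators on the same signal. The test signal (a 1 kHz tone buried in white noise), the seed, and nperseg=1024 are assumptions made for the example. It also passes scaling='spectrum', which is not the default: with this setting the result has units of V**2 rather than V**2/Hz, so its square root can be read directly as an RMS amplitude.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)          # fixed seed so the sketch is repeatable
fs = 8000                               # sampling rate in Hz (assumed)
t = np.arange(4 * fs) / fs              # four seconds of sample times
x = np.sin(2 * np.pi * 1000 * t) + rng.normal(scale=1.0, size=t.shape)

# Single periodogram: finest frequency resolution, but a very spiky noise floor
f_p, P_p = signal.periodogram(x, fs, scaling='spectrum')

# Welch: averages periodograms of overlapping 1024-sample segments,
# giving a smoother noise floor with wider frequency bins
f_w, P_w = signal.welch(x, fs, nperseg=1024, scaling='spectrum')

# Square roots give RMS amplitude estimates per frequency bin
A_rms_p = np.sqrt(P_p)
A_rms_w = np.sqrt(P_w)

welch_peak_freq = f_w[np.argmax(P_w)]   # should sit at the tone frequency
```

Plotting both curves with plt.semilogy(f_p, A_rms_p) and plt.semilogy(f_w, A_rms_w) makes the difference visible: the Welch curve has a much flatter noise floor, while the periodogram resolves the tone into a narrower peak.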

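import numpy as np
from scipy import signal
import matplotlib.pyplot as plt

# Build a test signal: a frequency-modulated 3 kHz carrier plus decaying noise
fs = 10e3                      # sampling rate in Hz
N = int(1e5)                   # number of samples
amp = 2 * np.sqrt(2)
noise_power = 0.01 * fs / 2
time = np.arange(N) / float(fs)
mod = 500*np.cos(2*np.pi*0.25*time)
carrier = amp * np.sin(2*np.pi*3e3*time + mod)
noise = np.random.normal(scale=np.sqrt(noise_power), size=time.shape)
noise *= np.exp(-time/5)
x = carrier + noise

# Compute the spectrogram: f = frequencies, t = segment times, Sxx = power
f, t, Sxx = signal.spectrogram(x, fs)

# Plot power as a function of time and frequency
plt.pcolormesh(t, f, Sxx, shading='gouraud')
plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [sec]')
plt.show()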

The code above is taken from the SciPy documentation for scipy.signal.spectrogram. Install SciPy on your machine if you do not have it already and read its documentation:

https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.spectrogram.html
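Since the question is about an audio file rather than a synthetic signal, here is one possible way to connect the two, offered as a sketch under stated assumptions. It assumes the audio is a 16-bit PCM WAV file readable by scipy.io.wavfile; other formats would need a different loader. To stay self-contained it first writes a hypothetical example.wav containing a 440 Hz tone, which stands in for your own recording. The nperseg and noverlap values are illustrative choices.

```python
import os
import tempfile
import numpy as np
from scipy import signal
from scipy.io import wavfile

# Stand-in for a real recording: write a 2 s, 440 Hz tone to a temporary WAV file
fs_out = 16000
t = np.arange(2 * fs_out) / fs_out
tone = (0.5 * np.sin(2 * np.pi * 440 * t) * 32767).astype(np.int16)
path = os.path.join(tempfile.mkdtemp(), "example.wav")   # hypothetical file name
wavfile.write(path, fs_out, tone)

# Load the audio: wavfile.read returns the sampling rate and the sample array
fs, data = wavfile.read(path)
if data.ndim > 1:
    data = data.mean(axis=1)              # mix stereo down to mono
x = data.astype(np.float64) / 32768.0     # scale to [-1, 1); assumes int16 samples

# Spectrogram: rows of Sxx are frequencies, columns are time segments
f, t_seg, Sxx = signal.spectrogram(x, fs, nperseg=1024, noverlap=512)
Sxx_db = 10 * np.log10(Sxx + 1e-12)       # decibels; small offset avoids log(0)

# Averaging over time collapses the spectrogram to one power-vs-frequency curve
mean_power = Sxx.mean(axis=1)
dominant_freq = f[np.argmax(mean_power)]
```

To display the result, plt.pcolormesh(t_seg, f, Sxx_db, shading='gouraud') gives the time-frequency picture, and plt.plot(f, 10 * np.log10(mean_power + 1e-12)) gives a single averaged spectrum. A decibel scale is usually easier to read for audio because the power spans many orders of magnitude.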
