librosa melspectrogram y-axis scale wrong?

I'm trying to figure out why my mel-scale spectrogram seems to have the wrong frequency scale. I generate a 4096 Hz tone and plot it using librosa's display module, but the tone does not line up with 4096 Hz on the y-axis. I'm obviously doing something wrong; can someone help? Thanks!

import numpy as np
import librosa.display
import matplotlib.pyplot as plt

sr = 44100
t = np.linspace(0, 1, sr)
y = 0.1 * np.sin(2 * np.pi * 4096 * t)

M = librosa.feature.melspectrogram(y=y, sr=sr)
M_db = librosa.power_to_db(M, ref=np.max)
librosa.display.specshow(M_db, y_axis='mel', x_axis='time')
plt.show()

When you compute the mel spectrogram with librosa.feature.melspectrogram(y=y, sr=sr), you implicitly create a mel filter bank with fmin=0 and fmax=sr/2 (see the docs). To plot the spectrogram correctly, librosa.display.specshow needs to know how it was created, i.e. which sample rate sr was used (to get the time axis right) and which frequency range was used (to get the frequency axis right). While librosa.feature.melspectrogram defaults to 0 - sr/2, librosa.display.specshow unfortunately assumes sr=22050 and therefore defaults to 0 - 11025 Hz. This describes librosa 0.8; it may well change in future versions.

To get this to work correctly, pass fmax explicitly to both librosa.feature.melspectrogram and librosa.display.specshow. To also get the time axis right, pass sr to librosa.display.specshow:

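import numpy as np
import librosa
import librosa.display
import matplotlib.pyplot as plt

sr = 44100
t = np.linspace(0, 1, sr)
y = 0.1 * np.sin(2 * np.pi * 4096 * t)  # 1-second test tone at 4096 Hz

# Compute the mel spectrogram over the full range 0 to sr/2
M = librosa.feature.melspectrogram(y=y, sr=sr, fmax=sr/2)
M_db = librosa.power_to_db(M, ref=np.max)

# Tell specshow the same sr and fmax so both axes are labelled correctly
librosa.display.specshow(M_db, sr=sr, y_axis='mel', x_axis='time', fmax=sr/2)
plt.show()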

[Figure: the correct mel spectrogram]
