How do I plot the spectrum, or frequency vs. amplitude, of an entire audio file using Python?

Question

I have some audio files, and I want to plot their average spectrum, as the Audacity software does, using Python (the librosa library). As far as I can see, Audacity plots average frequency vs. amplitude for the entire audio.

After that, I want to apply a CNN to classify the samples into two classes. I am looking for suggestions.

Thank you.

Answer 1

Usually you use librosa.display.specshow to plot spectrograms over time, not over the whole file. In fact, as input for your CNN you might rather use a spectrogram over time, as produced by librosa.stft, or a Mel spectrogram, depending on what your classification goal is.

E.g., if you want to classify by genre, a Mel spectrogram may be most appropriate. If you want to find the key or chords, you'll need a constant-Q transform (CQT) spectrogram, etc.

That said, here's some code that answers your question:
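Whichever spectrogram you pick, a CNN usually trains more easily on standardized input. The following is only a minimal sketch of that step, using NumPy alone: `standardize`, `fake_spec`, and the channels-last batch layout are illustrative assumptions, and the random array merely stands in for a real (frequency bins x time frames) spectrogram.

```python
import numpy as np


def standardize(spec, eps=1e-8):
    """Scale a 2-D spectrogram to zero mean and unit variance."""
    spec = np.asarray(spec, dtype=np.float32)
    # eps guards against division by zero for a constant input
    return (spec - spec.mean()) / (spec.std() + eps)


# Stand-in for a (frequency bins x time frames) spectrogram in dB (assumption)
rng = np.random.default_rng(0)
fake_spec = rng.normal(loc=-40.0, scale=12.0, size=(128, 200))

x = standardize(fake_spec)
# Add batch and channel axes; channels-last is assumed here,
# so adapt the layout to whatever your framework expects
batch = x[np.newaxis, ..., np.newaxis]
```

In practice you would probably compute the mean and standard deviation over the training set rather than per file, so that all samples are scaled the same way.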

import librosa
import numpy as np
import matplotlib.pyplot as plt


file = YOUR_FILE  # path to your audio file
# load the file (resampled to 44.1 kHz)
y, sr = librosa.load(file, sr=44100)
# short-time Fourier transform
# (n_fft and hop_length determine the frequency/time resolution)
n_fft = 2048
S = librosa.stft(y, n_fft=n_fft, hop_length=n_fft // 2)
# convert to dB
# (for your CNN you might want to skip this and rather ensure zero mean and unit variance)
D = librosa.amplitude_to_db(np.abs(S), ref=np.max)
# average over all time frames, i.e. over the whole file
D_AVG = np.mean(D, axis=1)

plt.bar(np.arange(D_AVG.shape[0]), D_AVG)
# label every 128th bin with its frequency (bin n sits at n * sr / n_fft Hz)
x_ticks_positions = list(range(0, n_fft // 2 + 1, n_fft // 16))
x_ticks_labels = [f'{sr / n_fft * n:.0f}Hz' for n in x_ticks_positions]
plt.xticks(x_ticks_positions, x_ticks_labels)
plt.xlabel('Frequency')
plt.ylabel('dB')
plt.show()

This produces a bar chart of the average level in dB for each frequency bin.
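Note that the code above averages values that are already in dB. If you would rather average the power across frames first and convert to dB afterwards, the sketch below shows one way to do it with NumPy only. The function name `average_spectrum_db`, the Hann window, and the 1 kHz test tone are assumptions made for illustration, and the signal is assumed to be at least `n_fft` samples long.

```python
import numpy as np


def average_spectrum_db(y, sr, n_fft=2048, hop_length=1024):
    """Return (frequencies in Hz, average power spectrum in dB re. its peak)."""
    window = np.hanning(n_fft)
    # Slice the signal into overlapping frames; a tail shorter than n_fft is dropped
    n_frames = 1 + (len(y) - n_fft) // hop_length
    frames = np.stack(
        [y[i * hop_length : i * hop_length + n_fft] for i in range(n_frames)]
    )
    # Power spectrum of every windowed frame
    power = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    # Average in the power domain, then convert to dB relative to the peak
    avg_power = power.mean(axis=0)
    avg_db = 10.0 * np.log10(np.maximum(avg_power, 1e-12) / avg_power.max())
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    return freqs, avg_db


# Synthetic one-second 1 kHz tone, used instead of a real file
sr = 44100
t = np.arange(sr) / sr
y = np.sin(2 * np.pi * 1000.0 * t)

freqs, avg_db = average_spectrum_db(y, sr)
peak_hz = freqs[np.argmax(avg_db)]
```

With a real file you would pass the `y` and `sr` returned by librosa.load and plot the result with `plt.plot(freqs, avg_db)`, which puts frequency in Hz directly on the x-axis.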

Answer 2

import matplotlib.pyplot as plt
from scipy import signal
from scipy.io import wavfile

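sample_rate, samples = wavfile.read('h1.wav')
# keep only the first channel if the file has more than one
if samples.ndim > 1:
    samples = samples[:, 0]
frequencies, times, spectrogram = signal.spectrogram(samples, sample_rate)

# plot power over time (x-axis) and frequency (y-axis)
plt.pcolormesh(times, frequencies, spectrogram)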

plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [sec]')
plt.show()
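The spectrogram above is resolved over time. To get a single frequency vs. amplitude curve for the whole file, as the question asks, one option is to average the spectrogram over its time axis. This is only a sketch under assumptions: it uses a synthetic 1 kHz tone at an assumed 8 kHz sample rate instead of 'h1.wav', and the small constant added before the logarithm is just there to avoid log(0).

```python
import numpy as np
from scipy import signal

# Synthetic two-second 1 kHz tone, used instead of a real file
sample_rate = 8000
t = np.arange(2 * sample_rate) / sample_rate
samples = np.sin(2 * np.pi * 1000.0 * t)

frequencies, times, spectrogram = signal.spectrogram(samples, sample_rate)
# spectrogram has shape (frequency bins, time segments); average out the time axis
avg_power = spectrogram.mean(axis=1)
# convert the averaged power to dB
avg_db = 10.0 * np.log10(avg_power + 1e-12)
peak_hz = frequencies[np.argmax(avg_db)]
```

Plotting `plt.plot(frequencies, avg_db)` then gives one curve for the whole signal instead of an image over time.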

2 answers

Answer 1
1 Accepted 2019-04-25 09:16:49

Answer 2
0 2019-04-25 07:19:41
