
How to plot the spectrum (frequency vs. amplitude) of an entire audio file using Python?

I have some audio files, and I want to plot their average spectrum the way the Audacity software does, using Python (the librosa library). As far as I can see, Audacity plots average frequency vs. amplitude for the entire audio.


After that, I want to apply a CNN to classify two classes of samples. I'm looking for suggestions.

Thank you.

Usually you use librosa.display.specshow to plot spectrograms over time, not over the whole file. In fact, as input for your CNN you might rather use a spectrogram over time, as produced by librosa.stft, or some Mel spectrogram, depending on what your classification goal is.

E.g., if you want to classify by genre, a Mel spectrogram may be most appropriate. If you want to find the key or chords, you'll need a constant-Q spectrogram (CQT), etc.

That said, here's some code that answers your question:

import librosa
import numpy as np
import matplotlib.pyplot as plt


file = YOUR_FILE  # path to your audio file
# load the file, resampled to 44.1 kHz (librosa loads mono by default)
y, sr = librosa.load(file, sr=44100)
# short-time Fourier transform
# (n_fft and hop_length determine the frequency/time resolution)
n_fft = 2048
S = librosa.stft(y, n_fft=n_fft, hop_length=n_fft // 2)
# convert magnitudes to dB, relative to the loudest bin
# (for your CNN you might want to skip this and rather ensure zero mean and unit variance)
D = librosa.amplitude_to_db(np.abs(S), ref=np.max)
# average over all frames, i.e. over the whole file
D_AVG = np.mean(D, axis=1)

# one bar per frequency bin
plt.bar(np.arange(D_AVG.shape[0]), D_AVG)
# label every (n_fft // 16)-th bin with its frequency in Hz
x_ticks_positions = list(range(0, n_fft // 2, n_fft // 16))
x_ticks_labels = [f'{sr / n_fft * n:.0f}Hz' for n in x_ticks_positions]
plt.xticks(x_ticks_positions, x_ticks_labels)
plt.xlabel('Frequency')
plt.ylabel('dB')
plt.show()

This produces the following output:

[Image: frequency in dB]
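The comment in the code above suggests skipping the dB conversion for CNN input and ensuring zero mean and unit variance instead. A minimal sketch of that step, assuming a magnitude spectrogram such as np.abs(S); the function name standardize, the eps guard, and the random stand-in data are made up for this example:

```python
import numpy as np


def standardize(spec, eps=1e-8):
    """Scale a (n_freq, n_frames) spectrogram to zero mean and unit variance."""
    spec = np.asarray(spec, dtype=np.float64)
    # eps guards against division by zero for a constant (e.g. silent) input
    return (spec - spec.mean()) / (spec.std() + eps)


# Random stand-in for np.abs(S); replace with a real magnitude spectrogram
rng = np.random.default_rng(0)
fake_mag = rng.random((1025, 40))
x = standardize(fake_mag)
```

This version normalizes each file on its own. A common alternative is to compute the mean and standard deviation once over the training set and reuse them for every file, so that relative loudness between files is preserved.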

import matplotlib.pyplot as plt
from scipy import signal
from scipy.io import wavfile

sample_rate, samples = wavfile.read('h1.wav')
# keep only the first channel if the file has more than one
if samples.ndim > 1:
    samples = samples[:, 0]
frequencies, times, spectrogram = signal.spectrogram(samples, sample_rate)

# draw the spectrogram with time on the x axis and frequency on the y axis
plt.pcolormesh(times, frequencies, spectrogram)

plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [sec]')
plt.show()
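The snippet above shows a spectrogram over time. For a single spectrum covering the whole file, which is what the question asks for, one option in SciPy is Welch's method (scipy.signal.welch), which averages the power spectra of overlapping segments. The sketch below is only an illustration under assumptions: the function name average_spectrum_db, the segment length, and the synthetic 1 kHz tone are invented here, and it is not claimed to reproduce Audacity's plot exactly.

```python
import numpy as np
from scipy import signal


def average_spectrum_db(samples, sample_rate, nperseg=2048):
    """Welch-averaged power spectrum in dB relative to its peak."""
    freqs, power = signal.welch(samples, fs=sample_rate, nperseg=nperseg)
    # the small offset avoids log10(0) for empty bins
    power_db = 10.0 * np.log10(power / power.max() + 1e-12)
    return freqs, power_db


# Synthetic 1-second 1 kHz tone as a stand-in for samples read from a WAV file
sample_rate = 44100
t = np.arange(sample_rate) / sample_rate
samples = np.sin(2 * np.pi * 1000.0 * t)
freqs, power_db = average_spectrum_db(samples, sample_rate)
peak_hz = freqs[np.argmax(power_db)]
```

Calling plt.plot(freqs, power_db) then gives a frequency vs. dB curve for the entire signal, with the x axis already in Hz.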
