
Audio spectrum extraction from audio file by python

Sorry if this is a duplicate, but I wonder whether there is a Python library that can extract the sound spectrum from audio files. I want to take an audio file and write an algorithm that returns a set of data {TimeStampInFile; Frequency-Amplitude}.

I heard that this is usually called beat detection, but as far as I can see, beat detection is not a precise method and is good only for visualisation, whereas I want to manipulate the extracted data and then convert it back to an audio file. I don't need to do this in real time.

I will appreciate any suggestions and recommendations.

You can compute and visualize the spectrum and the spectrogram using scipy. For this test I used this audio file: vignesh.wav

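from scipy.io import wavfile # scipy library to read wav files
import numpy as np

AudioName = "vignesh.wav" # Audio File
fs, Audiodata = wavfile.read(AudioName) # fs is the sample rate in Hz
if Audiodata.ndim > 1: # stereo/multichannel file: average the channels so the FFT below sees one 1-D signal
    Audiodata = Audiodata.mean(axis=1)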

# Plot the audio signal in time
import matplotlib.pyplot as plt
plt.plot(Audiodata)
plt.title('Audio signal in time',size=16)

# Spectrum
from scipy.fftpack import fft # Fourier transform
n = len(Audiodata)
AudioFreq = fft(Audiodata)
AudioFreq = AudioFreq[0:int(np.ceil((n+1)/2.0))] # Keep the non-negative-frequency half of the spectrum
MagFreq = np.abs(AudioFreq) # Magnitude
MagFreq = MagFreq / float(n)
# Power spectrum
MagFreq = MagFreq**2
# Double every bin that has a mirrored negative-frequency twin
if n % 2 > 0: # odd n: every bin except DC
    MagFreq[1:len(MagFreq)] = MagFreq[1:len(MagFreq)] * 2
else: # even n: every bin except DC and Nyquist
    MagFreq[1:len(MagFreq) - 1] = MagFreq[1:len(MagFreq) - 1] * 2

plt.figure()
freqAxis = np.arange(0, int(np.ceil((n+1)/2.0)), 1.0) * (fs / float(n)) # Bin frequencies in Hz
plt.plot(freqAxis/1000.0, 10*np.log10(MagFreq)) # Power spectrum in dB
plt.xlabel('Frequency (kHz)'); plt.ylabel('Power spectrum (dB)')


# Spectrogram
from scipy import signal
N = 512 # Number of points in the FFT
# signal.blackman was removed from recent SciPy releases; the window functions live in signal.windows
f, t, Sxx = signal.spectrogram(Audiodata, fs, window=signal.windows.blackman(N), nfft=N)
plt.figure()
plt.pcolormesh(t, f, 10*np.log10(Sxx)) # dB spectrogram
#plt.pcolormesh(t, f, Sxx) # Linear spectrogram
plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [s]')
plt.title('Spectrogram with scipy.signal', size=16)

plt.show()

I tested all the code and it works. You need numpy, matplotlib and scipy.

Cheers

I think your question has three separate parts:

  1. How to load audio files into python?
  2. How to calculate spectrum in python?
  3. What to do with the spectrum?

1. How to load audio files in python?

You are probably best off using scipy, as it provides a lot of signal processing functions. For loading audio files:

import scipy.io.wavfile

samplerate, data = scipy.io.wavfile.read("mywav.wav")

Now you have the sample rate (samples/s) in samplerate and the data as a numpy.array in data. You may want to transform the data into floating point, depending on your application.
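As a rough illustration of the floating-point conversion, here is a minimal sketch. The helper name to_float is made up for this example, and it assumes the integer sample types that WAV readers commonly return (unsigned 8-bit, signed 16-bit or 32-bit); adjust it if your data is laid out differently.

```python
import numpy as np

def to_float(data):
    """Scale PCM samples to floats in roughly [-1.0, 1.0] (hypothetical helper)."""
    data = np.asarray(data)
    if np.issubdtype(data.dtype, np.floating):
        # Already floating point: only normalise the dtype
        return data.astype(np.float64)
    if data.dtype == np.uint8:
        # 8-bit WAV samples are unsigned and centred on 128
        return (data.astype(np.float64) - 128.0) / 128.0
    # Signed integer PCM: divide by the magnitude of the most negative value
    full_scale = float(abs(int(np.iinfo(data.dtype).min)))
    return data.astype(np.float64) / full_scale

# Made-up 16-bit samples, just to show the scaling
samples = np.array([-32768, 0, 16384], dtype=np.int16)
print(to_float(samples))
```

Working in floats avoids integer overflow when you later scale, sum or filter the samples.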

There is also a standard python module, wave, for loading wav files, but numpy / scipy offers a simpler interface and more options for signal processing.
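For comparison, reading a file with the standard wave module might look roughly like the sketch below. The function name read_wav_int16 is invented for this example, and it only handles 16-bit PCM; other sample widths would need their own decoding.

```python
import wave
import numpy as np

def read_wav_int16(path):
    """Read a 16-bit PCM WAV file with the standard library (hypothetical helper)."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        samplerate = wf.getframerate()
        channels = wf.getnchannels()
        raw = wf.readframes(wf.getnframes())
    # WAV stores little-endian samples, interleaved channel by channel
    data = np.frombuffer(raw, dtype="<i2")
    if channels > 1:
        data = data.reshape(-1, channels)  # shape (frames, channels)
    return samplerate, data
```

The bookkeeping for sample width and channel layout is what scipy.io.wavfile.read does for you.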

2. How to calculate the spectrum

Brief answer: Use FFT. For more words of wisdom, see:

Analyze audio using Fast Fourier Transform

The longer answer is quite long. Windowing is very important, otherwise you'll have strange spectra.
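To see why windowing matters, here is a small sketch that compares the spectrum of the same tone with and without a Hann window. The sample rate, frame length and tone frequency are made-up values, chosen so that the tone falls halfway between two FFT bins, which is the worst case for leakage.

```python
import numpy as np

fs = 8000                      # sample rate in Hz (assumed for the example)
n = 1024                       # samples in the analysed frame
t = np.arange(n) / fs
freq = 128.5 * fs / n          # tone halfway between bins 128 and 129
x = np.sin(2 * np.pi * freq * t)

def spectrum_db(frame, window):
    """Magnitude spectrum in dB relative to its own peak."""
    mag = np.abs(np.fft.rfft(frame * window))
    mag = mag / mag.max()
    return 20 * np.log10(mag + 1e-12)  # small offset avoids log10(0)

rect_db = spectrum_db(x, np.ones(n))     # no window (rectangular)
hann_db = spectrum_db(x, np.hanning(n))  # Hann window

far_bin = 400                  # a bin far away from the tone
print("rectangular leakage (dB):", rect_db[far_bin])
print("Hann leakage (dB):      ", hann_db[far_bin])
```

Without a window, energy from the tone leaks into bins far from its frequency; with the Hann window the far-away bins end up much lower, at the cost of a somewhat wider peak.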

3. What to do with the spectrum

This is a bit more difficult. Filtering is often performed in the time domain for longer signals. Maybe if you tell us what you want to accomplish, you'll receive a good answer for this one. Calculating the frequency spectrum is one thing; getting meaningful results with it in signal processing is a bit more complicated.
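If the goal is the one described in the question (extract time/frequency/amplitude data, change it, and convert it back to audio), one possible approach is a short-time Fourier transform round trip. The sketch below is only an illustration under assumptions: it uses scipy.signal.stft and scipy.signal.istft on a synthetic two-tone signal, and the cutoff and frame length are arbitrary.

```python
import numpy as np
from scipy import signal

fs = 8000                                   # sample rate in Hz (assumed)
t = np.arange(fs) / fs                      # one second of samples
# Synthetic test signal: a 440 Hz tone plus a weaker 3 kHz tone
x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 3000 * t)

nperseg = 512                               # frame length (arbitrary choice)
f, frame_times, Z = signal.stft(x, fs=fs, window="hann", nperseg=nperseg)
# Z[i, j] is the complex value at frequency f[i] and time frame_times[j];
# its magnitude is the amplitude, in SciPy's own scaling
amplitude = np.abs(Z)

# Example manipulation: silence everything above 1 kHz
Z_mod = Z.copy()
Z_mod[f > 1000, :] = 0

# Back to the time domain, using the same window and frame length
_, y = signal.istft(Z_mod, fs=fs, window="hann", nperseg=nperseg)
y = y[:len(x)]                              # drop the padding added by stft
```

After scaling back to an integer type, y could be written out with scipy.io.wavfile.write. Editing STFT bins like this can introduce artifacts near the signal edges and at frame boundaries, which is one reason time-domain filters are often preferred for plain filtering.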

(I know you did not ask this one, but I see it coming with a probability >> 0. Of course, it may be that you have good knowledge of audio signal processing, in which case this is irrelevant.)
