
Python FFT an audio file

I'm trying to plot a magnitude-frequency spectrum from a wav file. The sample rate of the file is 44.1 kHz, and I only want to compute the FFT of the first 100 samples, for which I'm using np.fft.fft(). However, I am getting unexpected results; see image 1.

[image 1]

I only get the expected results when I compute the FFT of at least 2048 samples. Why?

Here is my code:

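import numpy as np
import matplotlib.pyplot as plt
import scipy.io.wavfile  # needed for wavfile.read below

def normalizeAudio(data):
    # Scale by the peak absolute value so negative peaks are handled too
    return np.float32(data / np.max(np.abs(data)))

SAMPLE_FOR = 1 # in seconds
samplerate, data = scipy.io.wavfile.read(r'Recording.wav')
data = normalizeAudio(data[0:int(samplerate*SAMPLE_FOR)])

N = 100  # number of samples to transform
fft_out = np.fft.fft(data[0:N])
# Bin k sits at k * samplerate / N Hz; bins above samplerate / 2 mirror the lower half
freq_vector = np.arange(N) * samplerate / N
plt.plot(freq_vector, np.abs(fft_out))
plt.show()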
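One likely reason the 100-sample plot looks wrong: the spacing between FFT bins is samplerate / N, so 100 samples at 44.1 kHz give one bin every 441 Hz, while 2048 samples give one roughly every 21.5 Hz. The two-sided output of np.fft.fft also mirrors everything above samplerate / 2 for real input. The sketch below illustrates both points with np.fft.rfft and np.fft.rfftfreq; the 1 kHz test tone, the Hann window and the helper name one_sided_spectrum are assumptions for illustration, not part of the original code.

```python
import numpy as np

def one_sided_spectrum(signal, samplerate):
    """Return (frequencies in Hz, magnitudes) for a real-valued signal."""
    n = len(signal)
    window = np.hanning(n)                    # taper the edges to limit leakage
    spectrum = np.fft.rfft(signal * window)   # keeps only 0 .. samplerate / 2
    freqs = np.fft.rfftfreq(n, d=1.0 / samplerate)
    return freqs, np.abs(spectrum)

samplerate = 44100
t = np.arange(samplerate) / samplerate        # one second of sample times
tone = np.sin(2 * np.pi * 1000.0 * t)         # hypothetical 1 kHz test tone

for n in (100, 2048):
    freqs, mags = one_sided_spectrum(tone[:n], samplerate)
    bin_spacing = samplerate / n              # distance between adjacent bins in Hz
    peak = freqs[np.argmax(mags)]             # centre frequency of the strongest bin
    print(f"N={n}: bin spacing {bin_spacing:.1f} Hz, strongest bin at {peak:.1f} Hz")
```

With only 100 samples the peak can only land on a multiple of 441 Hz, so narrow features in the signal get smeared across a few wide bins. If a short excerpt is required, zero-padding (the n argument of np.fft.rfft) gives a smoother-looking curve but does not add real resolution.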

Have a look at librosa; it's a very good library for audio analysis in Python, including chromagrams, spectrograms, percussive-component plots and other cool stuff.

It's very well documented, with examples there and plenty more on Stack Overflow, so I won't copy them here.

Additionally, most applications I've seen tend to use the STFT, the short-time Fourier transform, which applies a DFT (Discrete Fourier Transform, computed with the FFT) to successive short, windowed frames of the signal. It is not faster than a single FFT, but it is more useful for modelling because every frame has the same input shape, and it removes a lot of the noise you can get with one plain FFT, since each frame is tapered by a window rather than cut off abruptly.
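To make the STFT idea concrete, here is a minimal NumPy-only sketch: slice the signal into overlapping frames, multiply each by a Hann window, and take the FFT of every frame. The function name stft_magnitude, the frame length of 2048, the hop of 512 and the 440 Hz test tone are assumptions chosen for illustration; librosa and SciPy ship their own STFT routines with more options.

```python
import numpy as np

def stft_magnitude(signal, frame_len=2048, hop=512):
    """Magnitude STFT: one row per frame, one column per frequency bin.

    Assumes len(signal) >= frame_len; trailing samples that do not fill
    a whole frame are dropped.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1))

samplerate = 44100
t = np.arange(samplerate) / samplerate
tone = np.sin(2 * np.pi * 440.0 * t)          # hypothetical 440 Hz test tone

mags = stft_magnitude(tone)
freqs = np.fft.rfftfreq(2048, d=1.0 / samplerate)
print(mags.shape)                             # (number of frames, number of bins)
print(freqs[mags[0].argmax()])                # strongest bin of the first frame, in Hz
```

Plotting mags.T with plt.pcolormesh or plt.imshow would give a basic spectrogram; a longer frame_len sharpens frequency resolution at the cost of time resolution.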
