Frequency detection from a sound file

What I am trying to achieve is the following: I need the frequency values of a sound file (.wav) for analysis. I know a lot of programs will give a visual graph (spectrogram) of the values, but I need the raw data. I know this can be done with an FFT and should be fairly easy to script in Python, but I'm not sure exactly how to do it. So let's say that a signal in a file is 0.4 s long: I would like multiple measurements, output as an array giving, for each time point the program measures, the value (frequency) it found, and possibly the power (dB) too. The complicated thing is that I want to analyse bird songs, and they often have harmonics, or the signal spans a range of frequencies (e.g. 1000-2000 Hz). I would like the program to output this information as well, since it is important for the analysis I would like to do with the data :)

Now there is a piece of code that looks very much like what I want, but I think it does not give me all the values I need (thanks to Justin Peel for posting it in answer to a different question :)). I gather that I need numpy and pyaudio, but unfortunately I am not familiar with Python, so I am hoping that a Python expert can help me with this.

Source Code:

# Read in a WAV and find the freq's
import pyaudio
import wave
import numpy as np

chunk = 2048

# open up a wave
wf = wave.open('test-tones/440hz.wav', 'rb')
swidth = wf.getsampwidth()
RATE = wf.getframerate()
# use a Blackman window
window = np.blackman(chunk)
# open stream
p = pyaudio.PyAudio()
stream = p.open(format =
                p.get_format_from_width(wf.getsampwidth()),
                channels = wf.getnchannels(),
                rate = RATE,
                output = True)

# read some data
data = wf.readframes(chunk)
# play stream and find the frequency of each chunk
while len(data) == chunk*swidth:
    # write data out to the audio stream
    stream.write(data)
    # unpack the data (assumes 16-bit mono samples) and multiply by the Blackman window
    indata = np.frombuffer(data, dtype=np.int16) * window
    # Take the fft and square each value
    fftData=abs(np.fft.rfft(indata))**2
    # find the maximum
    which = fftData[1:].argmax() + 1
    # use quadratic interpolation around the max
    if which != len(fftData)-1:
        y0,y1,y2 = np.log(fftData[which-1:which+2:])
        x1 = (y2 - y0) * .5 / (2 * y1 - y2 - y0)
        # find the frequency and output it
        thefreq = (which+x1)*RATE/chunk
        print("The freq is %f Hz." % (thefreq))
    else:
        thefreq = which*RATE/chunk
        print("The freq is %f Hz." % (thefreq))
    # read some more data
    data = wf.readframes(chunk)
if data:
    stream.write(data)
stream.close()
p.terminate()

I'm not sure if this is what you want, but if you just want the FFT:

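import numpy as np
from scipy.io import wavfile
# scipy.io.wavfile stands in here for the unmaintained scikits.audiolab;
# filename is the path to a mono .wav file
fs, x = wavfile.read(filename)
X = np.fft.fft(x)

If you want the magnitude response:

import pylab
Xdb = 20*np.log10(np.abs(X))
# bin k of the FFT corresponds to the frequency k*fs/len(X)
f = np.linspace(0, fs, len(Xdb), endpoint=False)
pylab.plot(f, Xdb)
pylab.show()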
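Since the question asks for raw values rather than a plot, the same magnitude response can also be returned as arrays. The following is only a sketch under assumptions: the function name `magnitude_spectrum_db` is made up for illustration, the input is taken to be a real-valued mono signal, and a synthetic 440 Hz tone stands in for a real recording. It uses `np.fft.rfft`, which keeps only the non-negative frequencies of a real signal.

```python
import numpy as np

def magnitude_spectrum_db(x, fs):
    """Return (freqs_hz, magnitude_db) for a real-valued mono signal x.

    Hypothetical helper for illustration; the dB values are relative,
    not calibrated sound levels.
    """
    x = np.asarray(x, dtype=float)
    X = np.fft.rfft(x)                           # one-sided FFT of a real signal
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)  # frequency of each bin in Hz
    eps = 1e-12                                  # avoids taking log10 of zero
    mag_db = 20 * np.log10(np.abs(X) + eps)
    return freqs, mag_db

# Synthetic one-second 440 Hz tone in place of a real .wav file
fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t)

freqs, mag_db = magnitude_spectrum_db(x, fs)
peak_hz = freqs[np.argmax(mag_db)]
print("Strongest component: %.1f Hz" % peak_hz)
```

The two returned arrays line up element by element, so `freqs[k]` is the frequency whose level is `mag_db[k]`.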

I think that what you need to do is a Short-time Fourier Transform (STFT). Basically, you compute FFTs over multiple partially overlapping windows of the signal, which gives you one spectrum for each point in time. Then you would find the peak for each point in time. I haven't done this myself, but I've looked into it some in the past and this is definitely the way to go forward.
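A minimal sketch of that idea, using only numpy, might look like the following. Everything here is an assumption made for illustration rather than a tested recipe for bird song: the function name `stft_peaks`, the Hann window, and the frame length and hop size are arbitrary choices, and the input is a synthetic 0.4 s mono signal that jumps from 1000 Hz to 2000 Hz halfway through. The power values are relative dB, not calibrated levels.

```python
import numpy as np

def stft_peaks(x, fs, frame_len=1024, hop=256):
    """Short-time Fourier transform of a real mono signal (hypothetical helper).

    Returns (times_s, freqs_hz, power_db, peak_hz), where power_db has one
    row per frame and one column per frequency bin, and peak_hz holds the
    strongest frequency in each frame.
    """
    x = np.asarray(x, dtype=float)
    if len(x) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)                # tapers each frame's edges
    n_frames = 1 + (len(x) - frame_len) // hop    # overlapping frames that fit
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    # time stamp of each frame, taken at the frame's centre
    times = (np.arange(n_frames) * hop + frame_len / 2) / fs
    power_db = np.empty((n_frames, len(freqs)))
    for i in range(n_frames):
        frame = x[i * hop : i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        power_db[i] = 10 * np.log10(power + 1e-12)  # small offset avoids log10(0)
    peak_hz = freqs[np.argmax(power_db, axis=1)]
    return times, freqs, power_db, peak_hz

# Synthetic 0.4 s signal: 1000 Hz for the first half, 2000 Hz for the second
fs = 8000
t = np.arange(int(0.4 * fs)) / fs
x = np.where(t < 0.2, np.sin(2 * np.pi * 1000 * t), np.sin(2 * np.pi * 2000 * t))

times, freqs, power_db, peak_hz = stft_peaks(x, fs)
for time_s, f_hz in zip(times, peak_hz):
    print("t = %.3f s  peak = %.1f Hz" % (time_s, f_hz))
```

Each row of `power_db` is the full spectrum for one time point, so harmonics or a band of energy (say 1000-2000 Hz) can be read from the row itself rather than from the single peak. A longer `frame_len` gives finer frequency resolution but coarser time resolution, which matters for a signal as short as 0.4 s.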

There's some Python code to do an STFT here and here.
