
Python frequency detection

OK, what I'm trying to do is write a kind of audio processing software that can detect a prevalent frequency, and if that frequency is played for long enough (a few ms), I know I have a positive match. I know I would need to use an FFT or something similar, but this field of math is not my strength. I did search the internet but did not find code that does only this.

The goal I'm trying to achieve is to make myself a custom protocol to send data through sound. I need only a very low bitrate (5-10 bps), but I'm also very limited on the transmitting end, so the receiving software will need to be custom (I can't use an actual hardware/software modem). I also want this to be software only (no additional hardware except a sound card).

Thanks a lot for the help.

The aubio libraries have been wrapped with SWIG and can thus be used from Python. Their many features include several methods for pitch detection/estimation, including the YIN algorithm and some harmonic comb algorithms.

However, if you want something simpler, I wrote some code for pitch estimation some time ago and you can take it or leave it. It won't be as accurate as using the algorithms in aubio, but it might be good enough for your needs. I basically just took the FFT of the data multiplied by a window (a Blackman window in this case), squared the FFT values, found the bin that had the highest value, and used quadratic interpolation around the peak, on the log of the max value and its two neighboring values, to find the fundamental frequency. I took the quadratic interpolation from a paper that I found.

It works fairly well on test tones, but it will not be as robust or as accurate as the other methods mentioned above. The accuracy can be increased by increasing the chunk size (or reduced by decreasing it). The chunk size should be a power of 2 to make full use of the FFT. Also, I am only determining the fundamental pitch for each chunk, with no overlap between chunks. I used PyAudio to play the sound through while writing out the estimated pitch.
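To make the chunk-size trade-off concrete: the FFT bins are spaced `rate / chunk` Hz apart, while one chunk lasts `chunk / rate` seconds, so a finer frequency grid costs a longer listening time. The small sketch below just does that arithmetic. The helper name `chunk_tradeoff` and the 44100 Hz sample rate are assumptions for illustration, not part of the answer.

```python
def chunk_tradeoff(rate, chunk):
    """Return (bin spacing in Hz, chunk duration in ms) for one FFT chunk."""
    bin_spacing_hz = rate / chunk
    duration_ms = 1000.0 * chunk / rate
    return bin_spacing_hz, duration_ms


# 44100 Hz is an assumed sample rate; use the rate of your own audio
for chunk in (512, 1024, 2048, 4096):
    spacing, duration = chunk_tradeoff(44100, chunk)
    print("chunk=%4d  bin spacing=%6.2f Hz  duration=%6.2f ms"
          % (chunk, spacing, duration))
```

For a signal that only needs to be told apart from a few other well-separated tones, a smaller chunk may be enough, since interpolation around the peak recovers some of the lost precision.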

Source Code:

# Read in a WAV file and estimate the frequency of each chunk
import struct
import wave

import numpy as np
import pyaudio

chunk = 2048

# open up a wave file (this code assumes 16-bit mono samples)
wf = wave.open('test-tones/440hz.wav', 'rb')
swidth = wf.getsampwidth()
RATE = wf.getframerate()
# use a Blackman window
window = np.blackman(chunk)
# open stream
p = pyaudio.PyAudio()
stream = p.open(format=p.get_format_from_width(swidth),
                channels=wf.getnchannels(),
                rate=RATE,
                output=True)

# read some data
data = wf.readframes(chunk)
# play stream and find the frequency of each full chunk
while len(data) == chunk * swidth:
    # write data out to the audio stream
    stream.write(data)
    # unpack the 16-bit samples and multiply by the Blackman window
    indata = np.array(struct.unpack("%dh" % (len(data) // swidth),
                                    data)) * window
    # take the FFT and square each value
    fftData = abs(np.fft.rfft(indata)) ** 2
    # find the maximum, skipping the DC bin
    which = fftData[1:].argmax() + 1
    # use quadratic interpolation around the max
    if which != len(fftData) - 1:
        y0, y1, y2 = np.log(fftData[which - 1:which + 2])
        x1 = (y2 - y0) * .5 / (2 * y1 - y2 - y0)
        # find the frequency
        thefreq = (which + x1) * RATE / chunk
    else:
        # peak is in the last bin, so there is no right-hand neighbor
        thefreq = which * RATE / chunk
    print("The freq is %f Hz." % thefreq)
    # read some more data
    data = wf.readframes(chunk)
# play whatever partial chunk is left over
if data:
    stream.write(data)
stream.close()
p.terminate()
wf.close()
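
If you want to try the estimation step without a WAV file or a sound card, the same idea can be sketched as a standalone function run on a synthetic tone. This is only an illustration of the steps described above (Blackman window, squared FFT, log-domain quadratic interpolation). The function name `estimate_freq`, the 44100 Hz rate and the generated 440 Hz sine are assumptions, not part of the original code.

```python
import numpy as np


def estimate_freq(samples, rate):
    """Estimate the dominant frequency (Hz) of one chunk of samples."""
    samples = np.asarray(samples, dtype=float)
    n = len(samples)
    # window the chunk, then take the squared magnitude of the FFT
    power = np.abs(np.fft.rfft(samples * np.blackman(n))) ** 2
    # strongest bin, ignoring the DC bin at index 0
    peak = int(power[1:].argmax()) + 1
    if peak == len(power) - 1:
        # no right-hand neighbor, so skip the interpolation
        return peak * rate / n
    # fit a parabola through the log power of the peak and its neighbors
    y0, y1, y2 = np.log(power[peak - 1:peak + 2])
    offset = 0.5 * (y2 - y0) / (2 * y1 - y2 - y0)
    return (peak + offset) * rate / n


# hypothetical test signal: one chunk of a 440 Hz sine at 44100 Hz
rate = 44100
chunk = 2048
t = np.arange(chunk) / rate
tone = np.sin(2 * np.pi * 440.0 * t)
print("Estimated: %.2f Hz" % estimate_freq(tone, rate))
```

On a clean tone the estimate should land well inside one bin width (`rate / chunk`) of the true frequency; real recordings with noise or harmonics will be less tidy.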

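If you intend to use FSK (frequency-shift keying) to encode the data, you're probably better off using the Goertzel algorithm, so you can check just the frequencies you want instead of computing a full DFT/FFT.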
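
A minimal sketch of that idea, assuming a two-tone FSK scheme: the Goertzel recurrence below measures the power at a single DFT bin, and the receiver compares the power at the two tone frequencies for each block of samples. The function name `goertzel_power`, the 8000 Hz rate, the 400-sample block and the 1200/2200 Hz tones are all hypothetical choices for illustration.

```python
import math


def goertzel_power(samples, rate, target_hz):
    """Squared magnitude of `samples` at the DFT bin nearest target_hz."""
    n = len(samples)
    k = int(round(n * target_hz / rate))   # nearest bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev = 0.0
    s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2 = s_prev
        s_prev = s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


# hypothetical FSK parameters: 50 ms blocks at 8000 Hz, two tones
rate = 8000
n = 400
mark_hz, space_hz = 1200.0, 2200.0

# simulate one received block containing the "mark" tone
tone = [math.sin(2 * math.pi * mark_hz * i / rate) for i in range(n)]
p_mark = goertzel_power(tone, rate, mark_hz)
p_space = goertzel_power(tone, rate, space_hz)
print("mark" if p_mark > p_space else "space")
```

Choosing the block length so that both tones fall exactly on a bin (here 1200 Hz and 2200 Hz are multiples of 8000 / 400 = 20 Hz) keeps the two measurements from leaking into each other.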

You can compute the frequency spectrum of sliding windows over your sound, and then check for the presence of the prevalent frequency band by finding the area under the spectrum curve for that band.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import auc
np.random.seed(0)

# Sine sample with a frequency of 5hz and add some noise
sr = 32  # sampling rate
y = np.linspace(0, 5 * 2*np.pi, sr)
y = np.tile(np.sin(y), 5)
y += np.random.normal(0, 1, y.shape)
t = np.arange(len(y)) / float(sr)

# Generate frequency spectrum (Fs is passed by keyword; newer Matplotlib requires it)
spectrum, freqs, _ = plt.magnitude_spectrum(y, Fs=sr)

# Calculate percentage for a frequency range 
lower_frq, upper_frq = 4, 6
ind_band = np.where((freqs > lower_frq) & (freqs < upper_frq))
plt.fill_between(freqs[ind_band], spectrum[ind_band], color='red', alpha=0.6)
frq_band_perc = auc(freqs[ind_band], spectrum[ind_band]) / auc(freqs, spectrum)
print('{:.1%}'.format(frq_band_perc))
# 19.8%
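
The snippet above measures one band over the whole signal. To get the sliding-window part, and the "played for long enough" check from the question, one possible sketch is below. It uses only NumPy, sums the bin magnitudes in the band instead of calling `auc` (a close stand-in for the area, since the bins are evenly spaced), and counts how many consecutive windows exceed a threshold. All names, the 0.5 threshold, the window sizes and the synthetic test signal are assumptions for illustration.

```python
import numpy as np


def band_fraction(window, rate, lo_hz, hi_hz):
    """Fraction of the magnitude spectrum lying strictly inside (lo_hz, hi_hz)."""
    mags = np.abs(np.fft.rfft(window * np.hanning(len(window))))
    freqs = np.fft.rfftfreq(len(window), d=1.0 / rate)
    in_band = (freqs > lo_hz) & (freqs < hi_hz)
    total = mags.sum()
    return mags[in_band].sum() / total if total > 0 else 0.0


def sliding_band_fractions(signal, rate, win_len, hop, lo_hz, hi_hz):
    """Band fraction for each window of win_len samples, stepping by hop."""
    starts = range(0, len(signal) - win_len + 1, hop)
    return np.array([band_fraction(signal[s:s + win_len], rate, lo_hz, hi_hz)
                     for s in starts])


def longest_run(flags):
    """Length of the longest run of consecutive True values."""
    best = run = 0
    for flag in flags:
        run = run + 1 if flag else 0
        best = max(best, run)
    return best


# hypothetical test signal: 0.5 s of faint noise, then 0.5 s of a 1000 Hz tone
rate = 8000
t = np.arange(rate) / rate
rng = np.random.default_rng(0)
signal = 0.01 * rng.normal(size=rate)
signal[rate // 2:] += np.sin(2 * np.pi * 1000.0 * t[rate // 2:])

fracs = sliding_band_fractions(signal, rate, win_len=256, hop=128,
                               lo_hz=900, hi_hz=1100)
detected = fracs > 0.5          # assumed threshold
print("longest run of detections:", longest_run(detected), "windows")
```

With a hop of 128 samples at 8000 Hz, each window in a run adds 16 ms, so requiring a minimum run length is one way to express "the frequency was present for long enough".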


While I haven't tried audio processing with Python before, perhaps you could build something based on SciPy (or NumPy, which it builds on), a framework for efficient scientific/engineering numerical computation? You might start by looking at scipy.fftpack for your FFT (newer SciPy releases also provide scipy.fft, which is the recommended interface).
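
As a starting point along those lines, here is a small sketch that uses `scipy.fft` (rather than the older `scipy.fftpack`, whose real FFT returns a differently packed array) to find the strongest frequency in a block of samples. The function name `dominant_frequency` and the synthetic 1000 Hz test tone are assumptions for illustration.

```python
import numpy as np
from scipy.fft import rfft, rfftfreq


def dominant_frequency(samples, rate):
    """Return the frequency (Hz) of the largest non-DC bin in `samples`."""
    spectrum = np.abs(rfft(samples))
    freqs = rfftfreq(len(samples), d=1.0 / rate)
    peak = spectrum[1:].argmax() + 1   # skip the DC bin
    return freqs[peak]


# hypothetical test signal: 1024 samples of a 1000 Hz sine at 8000 Hz
rate = 8000
t = np.arange(1024) / rate
print(dominant_frequency(np.sin(2 * np.pi * 1000.0 * t), rate))
```

This only reports the nearest bin, so its precision is limited to the bin spacing `rate / len(samples)`.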

