
Specify minimum trigger frequency for recording audio in Python

I'm writing a script for sound-activated recording in Python using pyaudio. I want to trigger a 5 s recording when a sound exceeds a prespecified volume and frequency. I've managed to get the volume part working, but I don't know how to specify the minimum trigger frequency (for example, I'd like it to trigger at frequencies above 10 kHz):

import pyaudio
import wave
from array import array
import time

FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 44100
CHUNK = 1024
RECORD_SECONDS = 5

audio = pyaudio.PyAudio()

stream = audio.open(format=FORMAT, channels=CHANNELS,
                    rate=RATE,
                    input=True,
                    frames_per_buffer=CHUNK)

nighttime = True

while nighttime:
    data = stream.read(CHUNK)
    data_chunk = array('h', data)
    vol = max(data_chunk)
    if(vol>=3000):
        print("recording triggered")
        frames = []
        for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
            data = stream.read(CHUNK)
            frames.append(data)
        print("recording saved")
        # write to file
        words = ["RECORDING-", time.strftime("%Y%m%d-%H%M%S"), ".wav"]
        FILE_NAME = "".join(words)
        wavfile = wave.open(FILE_NAME, 'wb')
        wavfile.setnchannels(CHANNELS)
        wavfile.setsampwidth(audio.get_sample_size(FORMAT))
        wavfile.setframerate(RATE)
        wavfile.writeframes(b''.join(frames))
        wavfile.close()
    # check if still nighttime
    nighttime = True

stream.stop_stream()
stream.close()
audio.terminate()

I'd like to change the line if(vol>=3000): to something like if(vol>=3000 and frequency>10000): but I don't know how to compute frequency. How can I do this?

To retrieve the frequency content of a signal, you can compute its Fourier transform, which switches to the frequency domain (freq in the code). The next step is to compute the relative amplitude of the signal (amp). The spectral magnitude scales with the sound volume; dividing it by its sum gives each frequency's share of the total.

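import numpy as np

# audio_array: one chunk of samples as a NumPy array
# sampling_freq: the sample rate in Hz (RATE in the question)
spec = np.abs(np.fft.rfft(audio_array))  # magnitude spectrum
freq = np.fft.rfftfreq(len(audio_array), d=1 / sampling_freq)  # bin frequencies in Hz
amp = spec / spec.sum()  # each bin's share of the total magnitude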
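
The snippet above assumes that audio_array and sampling_freq already exist. A minimal sketch of how they might be derived from the bytes returned by stream.read(CHUNK) is shown below. The helper name chunk_spectrum is hypothetical, and a synthetic 12 kHz tone stands in for microphone data so the example runs without audio hardware.

```python
import numpy as np

RATE = 44100   # sample rate in Hz, as in the question
CHUNK = 1024   # samples per buffer, as in the question


def chunk_spectrum(data, rate):
    """Return (freq, amp) for one buffer of 16-bit mono PCM bytes.

    Hypothetical helper: `data` is assumed to be what stream.read(CHUNK)
    returns when the stream was opened with paInt16 and one channel.
    """
    audio_array = np.frombuffer(data, dtype=np.int16).astype(np.float64)
    spec = np.abs(np.fft.rfft(audio_array))
    freq = np.fft.rfftfreq(len(audio_array), d=1 / rate)
    total = spec.sum()
    # Guard against an all-zero (digitally silent) buffer.
    amp = spec / total if total > 0 else np.zeros_like(spec)
    return freq, amp


# Synthetic stand-in for microphone input: a 12 kHz sine as int16 bytes.
t = np.arange(CHUNK) / RATE
tone = (10000 * np.sin(2 * np.pi * 12000 * t)).astype(np.int16)
freq, amp = chunk_spectrum(tone.tobytes(), RATE)
peak_freq = freq[np.argmax(amp)]
print(f"strongest component near {peak_freq:.0f} Hz")
```

With 1024 samples at 44100 Hz the bins are roughly 43 Hz apart, so the reported peak can only be as precise as that spacing.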

Note that 3000 isn't a sound volume either. The true sound volume information was lost when the signal was digitized: a sample value of 3000 is only a level relative to the 16-bit full scale. You are working with relative numbers anyway, so you can simply check whether, e.g., 1/3 of the spectral magnitude in a frame lies above 10 kHz.

Here's some code to illustrate the concept:

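# index of the first frequency bin above 10 kHz
idx_above_10khz = np.argmax(freq > 10000)
# shares of the spectrum below and above 10 kHz (the two sum to 1)
amp_below_10k = amp[:idx_above_10khz].sum()
amp_above_10k = amp[idx_above_10khz:].sum()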

You can then trigger your program once the ratio amp_below_10k / amp_above_10k drops below a threshold of your choosing; the smaller the ratio, the more of the signal lies above 10 kHz.
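
Putting the pieces together, the check could be wrapped in a function that takes the place of the if(vol>=3000): test in the question. This is only a sketch under stated assumptions: the names high_freq_share, should_trigger and make_tone are hypothetical, the 1/3 threshold is the example value mentioned above, and the share above the cutoff is compared directly instead of the below/above ratio so that a buffer with nothing above 10 kHz cannot cause a division by zero.

```python
import numpy as np


def high_freq_share(samples, rate, min_freq):
    """Fraction of the spectral magnitude at or above min_freq (0.0 to 1.0)."""
    spec = np.abs(np.fft.rfft(samples.astype(np.float64)))
    freq = np.fft.rfftfreq(len(samples), d=1 / rate)
    total = spec.sum()
    if total == 0:
        return 0.0  # silent buffer: nothing to measure
    return float(spec[freq >= min_freq].sum() / total)


def should_trigger(data, rate=44100, min_vol=3000, min_freq=10000,
                   min_share=1 / 3):
    """Hypothetical trigger test for one buffer of 16-bit mono PCM bytes."""
    samples = np.frombuffer(data, dtype=np.int16)
    # Same volume test as the question: largest sample value in the buffer.
    if len(samples) == 0 or samples.max() < min_vol:
        return False
    return high_freq_share(samples, rate, min_freq) >= min_share


def make_tone(freq_hz, peak, rate=44100, n=1024):
    """Synthetic sine wave as int16 bytes, standing in for stream.read()."""
    t = np.arange(n) / rate
    return (peak * np.sin(2 * np.pi * freq_hz * t)).astype(np.int16).tobytes()


print(should_trigger(make_tone(12000, 10000)))  # loud and high-pitched
print(should_trigger(make_tone(1000, 10000)))   # loud but low-pitched
print(should_trigger(make_tone(12000, 500)))    # high-pitched but quiet
```

With these synthetic tones only the first call should return True. In the recording loop, the condition would become if should_trigger(data):, with min_vol, min_freq and min_share tuned to the microphone and environment.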

