Specify minimum trigger frequency for recording audio in Python
I'm writing a script for sound-activated recording in Python using pyaudio. I want to trigger a 5 s recording after a sound that is above a prespecified volume and frequency. I've managed to get the volume part working, but I don't know how to specify the minimum trigger frequency (I'd like it to trigger at frequencies above 10 kHz, for example):
import pyaudio
import wave
from array import array
import time

FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 44100
CHUNK = 1024
RECORD_SECONDS = 5

audio = pyaudio.PyAudio()

stream = audio.open(format=FORMAT, channels=CHANNELS,
                    rate=RATE,
                    input=True,
                    frames_per_buffer=CHUNK)

nighttime = True

while nighttime:
    data = stream.read(CHUNK)
    data_chunk = array('h', data)
    vol = max(data_chunk)
    if(vol >= 3000):
        print("recording triggered")
        frames = []
        for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
            data = stream.read(CHUNK)
            frames.append(data)
        print("recording saved")
        # write to file
        words = ["RECORDING-", time.strftime("%Y%m%d-%H%M%S"), ".wav"]
        FILE_NAME = "".join(words)
        wavfile = wave.open(FILE_NAME, 'wb')
        wavfile.setnchannels(CHANNELS)
        wavfile.setsampwidth(audio.get_sample_size(FORMAT))
        wavfile.setframerate(RATE)
        wavfile.writeframes(b''.join(frames))
        wavfile.close()
    # check if still nighttime
    nighttime = True

stream.stop_stream()
stream.close()
audio.terminate()
I'd like to change the line if(vol>=3000): to something like if(vol>=3000 and frequency>10000):, but I don't know how to set up frequency. How do I do this?
To retrieve the frequency content of a signal you can compute its Fourier transform, thus switching to the frequency domain (freq in the code). Your next step is to compute the relative amplitude of the signal (amp). The latter is proportional to the sound volume.
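import numpy as np

# Assumed here: audio_array is one chunk of samples (e.g. data_chunk)
# and sampling_freq is the sampling rate (RATE in the question).
spec = np.abs(np.fft.rfft(audio_array))
freq = np.fft.rfftfreq(len(audio_array), d=1 / sampling_freq)
amp = spec / spec.sum()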
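As a quick sanity check of the snippet above, here is a minimal sketch that applies the same transform to a synthetic 12 kHz tone and reads off the strongest frequency bin. The helper name dominant_frequency and the test tone are assumptions made for illustration and are not part of the original answer; the chunk size and sampling rate are taken from the question.

```python
import numpy as np

RATE = 44100   # sampling rate used in the question
CHUNK = 1024   # frames per buffer used in the question


def dominant_frequency(audio_array, sampling_freq):
    """Return the frequency (Hz) of the strongest bin in the chunk's spectrum.

    Hypothetical helper, named here for illustration only.
    """
    spec = np.abs(np.fft.rfft(audio_array))
    freq = np.fft.rfftfreq(len(audio_array), d=1 / sampling_freq)
    return freq[np.argmax(spec)]


# Synthetic int16 chunk containing a 12 kHz tone (illustrative test data).
t = np.arange(CHUNK) / RATE
tone = (10000 * np.sin(2 * np.pi * 12000 * t)).astype(np.int16)

peak = dominant_frequency(tone, RATE)
print(peak)  # within one bin width (RATE / CHUNK, about 43 Hz) of 12000
```

With a 1024-sample chunk at 44100 Hz the bins are about 43 Hz apart, so the reported frequency can only be as precise as that spacing.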
Mind that 3000 isn't a sound volume either. The true sound volume information was lost when the signal was digitized. Now you only work with relative numbers, so you can just check whether, e.g., 1/3 of the energy in a frame is above 10 kHz.
Here's some code to illustrate the concept:
idx_above_10khz = np.argmax(freq > 10000)
amp_below_10k = amp[:idx_above_10khz].sum()
amp_above_10k = amp[idx_above_10khz:].sum()
Now you could specify that from a certain ratio of amp_below_10k / amp_above_10k you should trigger your program.
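Putting the pieces together, the check could be wrapped in a helper that replaces the if(vol>=3000): test in the question. This is only a sketch under stated assumptions: the function name should_trigger, the min_fraction parameter and its default of 1/3 are illustrative choices, not values confirmed by the original posts, and the threshold would need tuning against real recordings.

```python
import numpy as np


def should_trigger(data_chunk, rate, vol_threshold=3000,
                   min_freq=10000, min_fraction=1 / 3):
    """Return True if the chunk is loud enough and mostly above min_freq.

    Hypothetical helper: vol_threshold mirrors the 3000 used in the
    question; min_fraction is an assumed, tunable cut-off.
    """
    samples = np.asarray(data_chunk, dtype=np.float64)
    # Same volume test as the question: peak sample value in the chunk.
    if samples.size == 0 or samples.max() < vol_threshold:
        return False

    spec = np.abs(np.fft.rfft(samples))
    total = spec.sum()
    if total == 0:
        return False

    freq = np.fft.rfftfreq(samples.size, d=1 / rate)
    amp = spec / total  # relative amplitude per frequency bin
    # Share of the relative amplitude that lies above min_freq.
    return bool(amp[freq > min_freq].sum() >= min_fraction)
```

In the question's loop this would be called as if should_trigger(data_chunk, RATE): in place of the volume-only test. Comparing the high-frequency share against a fraction, rather than dividing amp_below_10k by amp_above_10k, avoids a division by zero when a chunk has no high-frequency content.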