
Get the amplitude at a given time within a sound file?

Original post: 2009-04-12 22:36:30. Tags: python, audio, input, microphone, amplitude

I'm working on a project where I need to know the amplitude of sound coming in from a microphone on a computer.

I'm currently using Python with the Snack Sound Toolkit and I can record audio coming in from the microphone, but I need to know how loud that audio is. I could save the recording to a file and use another toolkit to read in the amplitude at given points in time from the audio file, or try to get the amplitude while the audio is coming in (which could be more error-prone).

Are there any libraries or sample code that can help me out with this? I've been looking, and so far the Snack Sound Toolkit seems to be my best hope, yet there doesn't seem to be a way to get direct access to amplitude.

3 answers

Looking at the Snack Sound Toolkit examples, there seems to be a dBPowerSpectrum function.

From the reference:

dBPowerSpectrum ( )

Computes the log FFT power spectrum of the sound (at the sample number given in the start option) and returns a list of dB values. See the section item for a description of the rest of the options. Optionally an ending point can be given, using the end option. In this case the result is the average of consecutive FFTs in the specified range. Their default spacing is taken from the fftlength but this can be changed using the skip option, which tells how many points to move the FFT window each step. Options:

EDIT: I am assuming that when you say amplitude, you mean how "loud" the sound appears to a human, and not the time-domain voltage. The average of that voltage would probably be 0 over the entire length, since the integral of a sine wave over whole periods is 0. For example, 10 * sin(t) is louder than 5 * sin(t), but both average to 0 over time. (You do not want to send non-AC voltages to a speaker anyway.)

To get how loud the sound is, you will need to determine the amplitude of each frequency component. This is done with a Fourier transform (usually computed as an FFT), which breaks the sound down into its frequency components. The dBPowerSpectrum function seems to give you a list of the magnitudes of each frequency (forgive me if this differs from the exact definition of a power spectrum). To get the total volume, you can just sum the entire list. This will be close, except that it may still differ from perceived loudness, since the human ear has a frequency response of its own.
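As a rough illustration of the frequency-domain idea, here is a minimal sketch that does not use Snack at all. It runs a naive pure-Python DFT over a synthetic sine wave and sums the power of every bin. The helper names (`power_spectrum`, `total_power`, `to_db`) are made up for this example. One assumption worth flagging: it sums linear power and converts to dB afterwards, because dB values are logarithmic and do not add up directly.

```python
import cmath
import math


def power_spectrum(samples):
    """Return the power |X[k]|^2 / N^2 of each DFT bin (naive O(N^2) DFT)."""
    n = len(samples)
    spectrum = []
    for k in range(n):
        # Correlate the signal with a complex sinusoid of frequency k/n.
        x_k = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        spectrum.append(abs(x_k) ** 2 / n ** 2)
    return spectrum


def total_power(samples):
    """Sum the linear power over all bins (equals the mean square, by Parseval)."""
    return sum(power_spectrum(samples))


def to_db(power, floor=1e-12):
    """Convert a linear power value to decibels, clamping to avoid log(0)."""
    return 10.0 * math.log10(max(power, floor))


# Two synthetic tones, 4 full cycles over 64 samples, amplitudes 10 and 5.
n = 64
loud = [10 * math.sin(2 * math.pi * 4 * i / n) for i in range(n)]
quiet = [5 * math.sin(2 * math.pi * 4 * i / n) for i in range(n)]

print(to_db(total_power(loud)) - to_db(total_power(quiet)))  # roughly 6 dB apart
```

For real recordings you would run this over short windows of samples, and a proper FFT library would be far faster than the naive loop used here.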

I disagree completely with this "answer" from CookieOfFortune.

Granted, the question is poorly phrased, but this answer makes things much more complex than necessary. I am assuming that by 'amplitude' you mean perceived loudness, since technically each sample in the (PCM) audio stream represents the amplitude of the signal at a given time slice. To get a loudness representation, try a simple RMS calculation:

RMS (root mean square)
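A minimal sketch of that RMS calculation, assuming the samples are already available as a plain Python sequence of 16-bit PCM values. The function names `rms` and `rms_dbfs` and the `full_scale` parameter are made up for this example, not part of any toolkit.

```python
import math


def rms(samples):
    """Root mean square of a sequence of PCM sample values."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def rms_dbfs(samples, full_scale=32768.0):
    """RMS level in dB relative to full scale (16-bit PCM assumed)."""
    level = rms(samples) / full_scale
    return 20.0 * math.log10(level) if level > 0 else float("-inf")


# One full cycle of a sine wave at half of 16-bit full scale.
n = 100
block = [16384 * math.sin(2 * math.pi * i / n) for i in range(n)]
print(rms(block))       # close to 16384 / sqrt(2)
print(rms_dbfs(block))  # close to -9 dBFS
```

To track loudness over time, you would call `rms` on successive short blocks (for example, a few hundred samples each) rather than on the whole recording.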

|K<

I'm not sure if this will help, but skimpygimpy provides facilities for parsing WAVE files into Python sequences and back. You could potentially use this to examine the waveform samples directly and do what you like. You will have to read some of the source, since these subcomponents are not documented.
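Since the skimpygimpy internals are undocumented, the sketch below swaps in Python's standard `wave` and `struct` modules instead to read the sample at a given time directly. It assumes a 16-bit PCM file and looks only at the first channel. `sample_at` and the demo file name are made up for this example.

```python
import os
import struct
import tempfile
import wave


def sample_at(path, seconds):
    """Return the first-channel sample value at `seconds` into a 16-bit PCM WAV."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frame = int(seconds * wav.getframerate())
        if not 0 <= frame < wav.getnframes():
            raise ValueError("time is outside the recording")
        wav.setpos(frame)
        raw = wav.readframes(1)
    # Frames are little-endian signed 16-bit, channels interleaved.
    return struct.unpack("<h", raw[:2])[0]


# Round trip: write a one-second, 8 kHz mono file with a known ramp, then query it.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.wav")
    with wave.open(demo_path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(8000)
        out.writeframes(struct.pack("<8000h", *range(8000)))
    value = sample_at(demo_path, 0.5)

print(value)  # sample number 4000 of the ramp
```

A single sample is an instantaneous amplitude. For a loudness figure at that time, you would read a short run of frames around the position and average them, for instance with an RMS.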