
Visualizing volume of PCM samples

I have several chunks of PCM audio (G.711) in my C++ application. I would like to visualize the audio volume of each of these chunks.

My first attempt was to calculate the average of the sample values for each chunk and use that as a volume indicator, but this doesn't work well. I do get 0 for chunks with silence and differing values for chunks with audio, but the values differ only slightly and don't seem to reflect the actual volume.

What would be a better algorithm to calculate the volume?

I hear G.711 audio is logarithmic PCM. How should I take that into account?

Note, I haven't worked with G.711 PCM audio myself, but I presume that you are performing the correct conversion from the encoded amplitude to an actual amplitude before processing the values.
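To make that conversion step concrete, here is a minimal sketch of expanding G.711 bytes to linear 16-bit samples. It assumes the chunks are mu-law (PCMU); the question doesn't say which G.711 variant is in use, and A-law needs a different expansion. The names `mulaw_to_linear` and `decode_mulaw_chunk` are made up for illustration.

```cpp
#include <cstdint>
#include <vector>

// Expand one G.711 mu-law byte to a 16-bit linear PCM sample.
// Assumption: the stream is mu-law (PCMU), not A-law (PCMA).
inline std::int16_t mulaw_to_linear(std::uint8_t encoded) {
    constexpr int kBias = 0x84;  // 132, the bias the encoder adds before compression
    // mu-law bytes are transmitted bit-inverted, so undo that first.
    const std::uint8_t u = static_cast<std::uint8_t>(~encoded);
    const int mantissa = u & 0x0F;         // low 4 bits: step within the segment
    const int exponent = (u & 0x70) >> 4;  // next 3 bits: segment number
    const int magnitude = (((mantissa << 3) + kBias) << exponent) - kBias;
    // Top bit (after inversion) set means a negative sample.
    return static_cast<std::int16_t>((u & 0x80) ? -magnitude : magnitude);
}

// Decode a whole chunk of mu-law bytes (hypothetical helper).
inline std::vector<std::int16_t> decode_mulaw_chunk(const std::vector<std::uint8_t>& chunk) {
    std::vector<std::int16_t> pcm;
    pcm.reserve(chunk.size());
    for (std::uint8_t byte : chunk) {
        pcm.push_back(mulaw_to_linear(byte));
    }
    return pcm;
}
```

Averaging or squaring the raw encoded bytes instead of the decoded samples would give misleading numbers, since the byte values are on a logarithmic scale.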

You'd expect the average value of most samples to be approximately zero, as sound waveforms oscillate either side of zero.

A crude volume calculation would be RMS (root mean square), i.e. taking a rolling average of the square of the samples and then taking the square root of that average. This will give you a positive quantity when there is some sound; the quantity is related to the power represented in the waveform.
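As a rough sketch of the RMS idea, assuming each chunk has already been decoded to linear 16-bit samples and that one value per chunk is enough (rather than a true rolling window): `chunk_rms` and `chunk_mean` are illustrative names, and the mean is included only to show why it makes a poor volume indicator.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Root mean square of one chunk of linear PCM samples.
// Returns 0.0 for an empty chunk.
inline double chunk_rms(const std::vector<std::int16_t>& samples) {
    if (samples.empty()) return 0.0;
    double sum_of_squares = 0.0;
    for (std::int16_t s : samples) {
        const double v = static_cast<double>(s);
        sum_of_squares += v * v;  // squaring removes the sign
    }
    return std::sqrt(sum_of_squares / static_cast<double>(samples.size()));
}

// Plain arithmetic mean, for comparison only: positive and negative
// half-waves cancel, so it stays near zero however loud the chunk is.
inline double chunk_mean(const std::vector<std::int16_t>& samples) {
    if (samples.empty()) return 0.0;
    double sum = 0.0;
    for (std::int16_t s : samples) {
        sum += static_cast<double>(s);
    }
    return sum / static_cast<double>(samples.size());
}
```

For a chunk that alternates between 1000 and -1000, `chunk_mean` gives 0 while `chunk_rms` gives 1000.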

For something better related to human perception of volume, you may want to investigate the sort of techniques used in Replay Gain.
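Short of implementing Replay Gain, one simple step toward a perceptually sensible display is to put the RMS value on a logarithmic (decibel) scale before drawing it. The sketch below is not Replay Gain, just a plain dBFS conversion. The function names, the 32768 full-scale reference, and the -96 dB floor are all assumptions chosen for 16-bit samples.

```cpp
#include <algorithm>
#include <cmath>

// Convert a linear RMS value to decibels relative to full scale (dBFS).
// Assumptions: full scale is 32768 (16-bit PCM); silence is clamped to floor_db.
inline double rms_to_dbfs(double rms, double full_scale = 32768.0, double floor_db = -96.0) {
    if (rms <= 0.0) return floor_db;  // log10(0) is undefined, so report the floor
    return std::max(floor_db, 20.0 * std::log10(rms / full_scale));
}

// Map a dBFS value onto a 0.0 .. 1.0 meter position for drawing.
// floor_db must be negative; values outside the range are clamped.
inline double dbfs_to_meter(double dbfs, double floor_db = -96.0) {
    const double position = (dbfs - floor_db) / -floor_db;
    return std::min(1.0, std::max(0.0, position));
}
```

With these defaults, an RMS of one tenth of full scale comes out at about -20 dBFS, and -48 dBFS lands halfway up the meter.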

If you're feeling ambitious, you can download G.711 from the ITU web site, and spend the next few weeks (or maybe more) implementing it.

If you're lazier (or more sensible) than that, you can download G.191 instead -- it includes source code to compress and decompress G.711 encoded data.

Once you've decoded it, visualizing the volume should be a whole lot easier.
