
Speex split audio data - WebAudio - VOIP

I'm running a little app that encodes and decodes an audio array with the Speex codec in JavaScript: https://github.com/dbieber/audiorecorder

With a small array filled with a sine waveform

for(var i=0;i<16384;i++)
    data.push(Math.sin(i/10));

this works. But I want to build a VOIP application, which means handling more than one array. If I split my array into 2 parts and then encode > decode > merge, the result doesn't sound the same as before.

Take a look at this:

fiddle: http://jsfiddle.net/exh63zqL/

Both buttons should give the same audio output.

How can I get the same output both ways? Is there a special mode in speex.js for split audio data?

Note that Speex is a lossy codec. So, by definition, it can't give back the same result as the buffer that was encoded. Besides, it is designed as a codec for voice, so the 1-2 kHz range will be handled most efficiently, because it expects a specific form of signal. In a way, it can be compared to JPEG for raster images.

I've slightly modified your jsfiddle example so you can play with different parameters and compare the results. Just providing a simple sinusoid with an unknown frequency is not a proper way to check a codec. However, in the example you can see how the codec affects the initial signal differently at different frequencies.

buffer1.push(Math.sin(2*Math.PI*i*frequency/sampleRate));
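
For reference, a self-contained version of the line above might look like the following. This is only a sketch: the helper name `makeSine` is hypothetical, and the 44100 Hz sample rate is an assumption taken from the playback code in the question, not necessarily the rate the codec itself works at.

```javascript
// Hypothetical helper (not part of speex.js): fill an array with a sine
// tone whose frequency in Hz is known, so codec results can be compared
// at different frequencies.
function makeSine(frequency, sampleRate, length) {
    var samples = [];
    for (var i = 0; i < length; i++) {
        samples.push(Math.sin(2 * Math.PI * i * frequency / sampleRate));
    }
    return samples;
}

// Assumed values: 1000 Hz tone, 44100 Hz sample rate, same length as the question.
var sampleRate = 44100;
var tone = makeSine(1000, sampleRate, 16384);
```

Changing the first argument lets you hear how the codec treats tones inside and outside the voice range.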

I think you should build an example with a recorded voice and compare the results in that case. It would be a more appropriate test.

In general, to understand this in detail you would have to study digital signal processing. I can't even provide a proper link, since it is a whole science and it is mathematically intensive (the only proper book I know of is in Russian). If anyone here with a strong mathematics background can share suitable literature, I would appreciate it.

EDIT: as mentioned by Kuroi Neko, there is a problem at the boundaries of the buffer. It also seems to be impossible to save the decoder state, as mentioned in this post, because the library in use doesn't support it. If you look at the source code, you can see that it uses a third-party Speex codec and does not provide full access to its features. I think the best approach would be to find a decent Speex library that supports state recovery, similar to this.

Speex is a lossy codec, so the output is only an approximation of your initial sine wave.
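
To put a number on "approximation", one option is to compute the RMS difference between the input and the decoded output. The sketch below uses a hypothetical helper `rmsError` and compares only the overlapping part of the two arrays. If the codec introduces a delay, a sample-by-sample comparison like this will overstate the audible difference, so treat the figure as a rough indicator only.

```javascript
// Hypothetical helper: root-mean-square difference between two sample
// arrays, computed over the length they have in common.
function rmsError(original, decoded) {
    var n = Math.min(original.length, decoded.length);
    if (n === 0) return 0;
    var sum = 0;
    for (var i = 0; i < n; i++) {
        var d = original[i] - decoded[i];
        sum += d * d;
    }
    return Math.sqrt(sum / n);
}
```

With the variables from the question, `rmsError(buffer, dec)` and `rmsError(buffer, merge)` would give two figures to compare.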

Your sine frequency is about 7 kHz, which is near the upper limit of the codec's 8 kHz bandwidth, and as such it is even more likely to be altered.

What the codec outputs looks like a comb of Dirac pulses that will sound like your initial sinusoid as heard through a phone, which is certainly different from the original.

See this fiddle, where you can listen to what the codec makes of your original sine waves, be they split in half or not.

//Generate a continuous sine wave, both whole and split into 2 arrays
var len = 16384;
var buffer1 = [];
var buffer2 = [];
var buffer = [];
for(var i=0;i<len;i++){
    buffer.push(Math.sin(i/10));
    if(i < len/2)
        buffer1.push(Math.sin(i/10));
    else
        buffer2.push(Math.sin(i/10));
}
//Encode and decode both arrays separately
var en = Codec.encode(buffer1);
var dec1 = Codec.decode(en);

var en = Codec.encode(buffer2);
var dec2 = Codec.decode(en);

//Merge the arrays to 1 output array
var merge = [];
for(var i=0;i<dec1.length;i++)
    merge.push(dec1[i]);

for(var i=0;i<dec2.length;i++)
    merge.push(dec2[i]);

//encode and decode the whole array
var en = Codec.encode(buffer);
var dec = Codec.decode(en);

//-----------------
//The code below only plays the different arrays
//-------------------
//Pick whichever constructor exists before calling new on it
var AudioContextClass = window.AudioContext || window.webkitAudioContext;
var audioCtx = new AudioContextClass();
function play (sound)
{
    var audioBuffer = audioCtx.createBuffer(1, sound.length, 44100);
    var bufferData = audioBuffer.getChannelData(0);
    bufferData.set(sound);

    var source = audioCtx.createBufferSource();
    source.buffer = audioBuffer;
    source.connect(audioCtx.destination);
    source.start();
}

$("#o").click(function() { play(dec); });
$("#c1").click(function() { play(dec1); });
$("#c2").click(function() { play(dec2); });
$("#m").click(function() { play(merge); });

If you merge the two half-signal decoder outputs, you will hear an additional click due to the abrupt transition from one signal to the other, sounding basically like a relay switching.
To avoid that, you would have to smooth the values around the merging point of your two buffers.
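
One possible way to do that smoothing is to fade the end of the first buffer out and the start of the second buffer in. The sketch below is an illustration only: `mergeWithFades` and its `fadeLength` parameter are hypothetical names, and a linear fade is just one of several options. It replaces the click with a very short dip in volume rather than restoring the original signal.

```javascript
// Hypothetical helper: concatenate two decoded buffers, applying a linear
// fade-out to the tail of the first and a linear fade-in to the head of
// the second, so the junction passes through zero instead of jumping.
function mergeWithFades(first, second, fadeLength) {
    // Never fade over more samples than either buffer contains.
    var n = Math.min(fadeLength, first.length, second.length);
    var merged = [];
    for (var i = 0; i < first.length; i++) {
        var remaining = first.length - 1 - i; // samples left after this one
        var gainOut = remaining < n ? remaining / n : 1;
        merged.push(first[i] * gainOut);
    }
    for (var j = 0; j < second.length; j++) {
        var gainIn = j < n ? j / n : 1;
        merged.push(second[j] * gainIn);
    }
    return merged;
}
```

With the variables from the code above, `var merge = mergeWithFades(dec1, dec2, 64);` would replace the two merge loops. The value 64 is an arbitrary starting point to tune by ear.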
