Is it really possible for WebRTC to stream high quality audio without noise?

I have tested with the highest quality settings and multiple STUN/TURN servers, with no luck in finding a real high-quality stream.

In my experience, WebRTC always has a fluctuating and limited bandwidth and a high level of background noise, and it doesn't reach the quality of MP3/Shoutcast/Icecast radio streams.

Has anyone found a way to provide a real high-bandwidth audio stream with WebRTC, or is it not actually possible at this time?

The default audio settings for WebRTC are pretty low. It defaults to mono audio at around 42 kb/s, as it seems to be designed for voice. I increased the quality by configuring a few settings.

  1. Disable autoGainControl, echoCancellation and noiseSuppression in the getUserMedia() constraints:
navigator.mediaDevices.getUserMedia({
  audio: {
    autoGainControl: false,
    channelCount: 2,
    echoCancellation: false,
    latency: 0,
    noiseSuppression: false,
    sampleRate: 48000,
    sampleSize: 16,
    volume: 1.0
  }
});
  2. Add the stereo and maxaveragebitrate attributes to the SDP:
let answer = await peer.conn.createAnswer(offerOptions);
answer.sdp = answer.sdp.replace('useinbandfec=1', 'useinbandfec=1; stereo=1; maxaveragebitrate=510000');
await peer.conn.setLocalDescription(answer);

This gives a potential maximum bitrate of 510 kbps for stereo (the maxaveragebitrate value of 510000 bits per second set above), which is 255 kbps per channel!
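The plain string replacement above only takes effect when the SDP happens to contain the exact text 'useinbandfec=1'. A slightly more defensive variant is sketched below. It assumes the SDP has an Opus rtpmap line and a matching fmtp line; setOpusParams is a hypothetical helper name, not part of the WebRTC API.

```javascript
// Hypothetical helper (not a WebRTC API): merges extra parameters into the
// fmtp line that belongs to the Opus payload type found in the SDP.
function setOpusParams(sdp, params) {
  // Find the payload type mapped to Opus, e.g. "a=rtpmap:111 opus/48000/2".
  const rtpmap = sdp.match(/^a=rtpmap:(\d+) opus\/48000/mi);
  if (!rtpmap) return sdp; // No Opus in this SDP: leave it untouched.

  const fmtpLine = new RegExp(`^(a=fmtp:${rtpmap[1]} )([^\\r\\n]*)`, 'm');
  return sdp.replace(fmtpLine, (line, prefix, existing) => {
    // Keep the parameters already present, then add or override ours.
    const merged = new Map();
    for (const pair of existing.split(';')) {
      const [key, value] = pair.trim().split('=');
      if (key) merged.set(key, value);
    }
    for (const [key, value] of Object.entries(params)) {
      merged.set(key, String(value));
    }
    return prefix + [...merged].map(([k, v]) => `${k}=${v}`).join(';');
  });
}
```

It could be used in place of the replace() call, for example: answer.sdp = setOpusParams(answer.sdp, { stereo: 1, maxaveragebitrate: 510000 });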

Actual bitrate depends on the speed of your network and strength of your signal.
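To see what bitrate is actually being sent, one option is to compare two snapshots of the connection statistics. The sketch below is only an illustration: bitrateKbps and readOutboundAudio are hypothetical helper names, and it assumes the browser reports an 'outbound-rtp' entry of kind 'audio' with bytesSent and timestamp fields from getStats().

```javascript
// Average send bitrate in kbps between two stats snapshots.
// Each snapshot needs bytesSent (bytes) and timestamp (milliseconds).
function bitrateKbps(prev, curr) {
  const seconds = (curr.timestamp - prev.timestamp) / 1000;
  if (seconds <= 0) return 0; // Snapshots out of order or identical.
  return ((curr.bytesSent - prev.bytesSent) * 8) / seconds / 1000;
}

// Browser-only sketch: returns the outbound audio stats entry of an
// RTCPeerConnection, or null if the report has none.
async function readOutboundAudio(pc) {
  const report = await pc.getStats();
  for (const stat of report.values()) {
    if (stat.type === 'outbound-rtp' && stat.kind === 'audio') return stat;
  }
  return null;
}
```

Calling readOutboundAudio(peer.conn) twice, a second or so apart, and passing both results to bitrateKbps gives a rough figure to compare against the configured maximum.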

More information about the SDP:

The Session Description Protocol (SDP) [RFC4566] describes various aspects of multimedia session such as media capabilities, transport addresses and related metadata in a transport agnostic manner, for the purposes of session announcement, session invitation and parameter negotiation.

https://tools.ietf.org/id/draft-nandakumar-rtcweb-sdp-01.html#rfc.section.3

Check out my project which implements these features: https://github.com/kmturley/webrtc-radio

Firstly, it's worth saying that WebRTC builds on the underlying network connectivity, and if that is poor there is very little any higher layer can do to avoid this.

Looking at the particular comparison you have highlighted, there are a few factors which are key to VoIP voice quality (assuming from the question that you are focused on voice):

  • Latency: to avoid delay and echo, voice communication needs low end-to-end latency. The target for good-quality VoIP systems is usually under 200 ms.
  • Jitter: this is essentially the variance in the latency, i.e. how the end-to-end delay varies over time.
  • Packet loss: voice is actually reasonably tolerant of packet loss compared to data. VoIP targets are typically in the range of 1% or less.
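To make the jitter point concrete, here is a minimal sketch of the running interarrival jitter estimate described in RFC 3550 (the RTP specification), where each new delay difference is folded in with a gain of 1/16. The function name and the sample values are illustrative assumptions, not measurements.

```javascript
// Running jitter estimate in the style of RFC 3550:
//   J = J + (|D| - J) / 16
// where D is the change in transit time between consecutive packets.
// transitTimesMs: per-packet one-way transit times in milliseconds.
function interarrivalJitter(transitTimesMs) {
  let jitter = 0;
  for (let i = 1; i < transitTimesMs.length; i++) {
    const d = Math.abs(transitTimesMs[i] - transitTimesMs[i - 1]);
    jitter += (d - jitter) / 16;
  }
  return jitter;
}

// A steady 50 ms delay has no jitter; a delay that swings between
// 30 ms and 90 ms builds up a noticeable estimate.
const steady = interarrivalJitter([50, 50, 50, 50]);
const swinging = interarrivalJitter([30, 90, 30, 90, 30, 90]);
```

Note that the steady case has the higher average of neither: latency and jitter are separate measures, and a link can be slow but steady or fast but erratic.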

Comparing this with streamed radio etc., the key point is the latency: it is not unusual to wait several seconds for a stream to start playing back.

This allows the receiver to fill a much bigger buffer of packets waiting to be decoded and played back, and makes it much more tolerant of variations in the latency (jitter).

Taking a simple example, if you had a brief half-second interruption in your connection, this would immediately impact a two-way VoIP call, but it might not impact streamed audio at all, assuming the network recovers fully and the buffer had several seconds' worth of content in it at the time.
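The half-second example can be sketched as a toy simulation. This is a deliberately simplified model with assumed numbers: audio arrives in real time except during the outage, playback starts once the chosen prebuffer is filled, and any step with nothing left to play counts as an audible gap. The function name and parameters are hypothetical.

```javascript
// Toy playout-buffer model: returns how many milliseconds of playback
// had no audio available (an underrun) over the simulated period.
function underrunMs({ prebufferMs, outageStartMs, outageMs, durationMs, stepMs = 10 }) {
  let buffered = 0;     // Milliseconds of audio waiting to be played.
  let playing = false;  // Playback starts once the prebuffer is filled.
  let underrun = 0;
  for (let t = 0; t < durationMs; t += stepMs) {
    const inOutage = t >= outageStartMs && t < outageStartMs + outageMs;
    if (!inOutage) buffered += stepMs; // Network delivers audio in real time.
    if (!playing && buffered >= prebufferMs) playing = true;
    if (playing) {
      if (buffered >= stepMs) buffered -= stepMs; // Play one step of audio.
      else underrun += stepMs;                    // Nothing to play: a gap.
    }
  }
  return underrun;
}

const outage = { outageStartMs: 5000, outageMs: 500, durationMs: 10000 };
// Radio-style stream that buffers 3 seconds before it starts playing.
const streamGap = underrunMs({ prebufferMs: 3000, ...outage });
// Real-time call with only a 20 ms buffer.
const callGap = underrunMs({ prebufferMs: 20, ...outage });
```

In this model the stream with the 3 second buffer plays through the outage without a gap, while the call with the 20 ms buffer is silent for almost the whole half second.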

So the quality differences you are seeing compared to streamed audio are most likely related to the real-time nature of the communication rather than to inherent WebRTC faults. Or, maybe more precisely: even if WebRTC were perfect, real-time two-way VoIP is very susceptible to network conditions.

As a note, video clearly needs much more bandwidth, and is also impacted by the network, but people tend to be more tolerant of video 'stutters' than of voice quality issues in multimedia calls (at this time anyway).
