
Programmatically change the speed of an audio file in real-time

Environment

  • Hardware: Raspberry Pi x
  • OS: Raspbian Jessie Lite
  • Language: Qt5 / C++

Goal

Play an audio file (WAV or, better, MP3), changing its speed smoothly and continuously. The pitch should change with the speed (playback rate). My application updates a variable containing the desired speed several times per second, where 1.0 is normal speed. The required range is about 0.2 to 3.0, with a resolution of 0.01.

The audio is likely music; the expected format is mono, 16-bit, 11.025 kHz. There are no specific latency constraints: anything below 500 ms is acceptable.

Some thoughts

QMediaPlayer in QtMultimedia has a playbackRate property that should do exactly this. Unfortunately, I have never been able to make QtMultimedia work on my systems.

Using an external player is also fine, with data sent over pipes or any other IPC mechanism.

How would you achieve this?

I don't know how much of this translates to C++. The work I did on this problem uses Java. Still, the algorithm itself should be of help.

Example data (made up):

sample    value
0          0.0
1          0.3
2          0.5
3          0.6
4          0.2
5         -0.1
6         -0.4

At normal speed, we send the output line a series of values in which the sample number increments by 1 per output frame.

If we were going slower, say half speed, we would output twice as many values before reaching the same point in the media data. In other words, our output needs to include values at the non-existent, intermediate sample frame locations 0.5, 1.5, 2.5, ...

To do this, it turns out that linear interpolation works quite well for audio. A more sophisticated curve-fitting algorithm is possible, but the gain in fidelity is generally not considered worth the trouble.

So, we end up with a stream as follows (for half speed):

sample    value
0          0.0
0.5        0.15
1          0.3
1.5        0.4
2          0.5
2.5        0.55
3          0.6
etc.

If you want to play back at 3/4 speed, the positions and values used in the output would be:

sample    value
0          0.0
0.75       0.225
1.5        0.4
2.25       0.525
3          0.6
3.75       0.3
etc.
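
The two tables above can be reproduced with a few lines of C++. This is only a minimal sketch of the interpolation idea, not the code from the answer: the names sampleAt and resample are made up for illustration, and the samples are assumed to be already converted to floating point.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Linearly interpolated value at a fractional sample position.
// `pos` must lie in [0, data.size() - 1]. (Hypothetical helper name.)
double sampleAt(const std::vector<double>& data, double pos) {
    assert(!data.empty());
    assert(pos >= 0.0 && pos <= static_cast<double>(data.size() - 1));
    const std::size_t i = static_cast<std::size_t>(pos);  // integer part
    const double frac = pos - static_cast<double>(i);     // fractional part
    if (i + 1 >= data.size()) return data[i];             // last sample: nothing to blend with
    return data[i] * (1.0 - frac) + data[i + 1] * frac;
}

// Resample a whole buffer at a constant speed (1.0 = normal, 0.5 = half speed).
// Output frame n reads the input at position n * speed.
std::vector<double> resample(const std::vector<double>& data, double speed) {
    assert(speed > 0.0);
    std::vector<double> out;
    if (data.empty()) return out;
    const double last = static_cast<double>(data.size() - 1);
    for (std::size_t n = 0; static_cast<double>(n) * speed <= last; ++n) {
        out.push_back(sampleAt(data, static_cast<double>(n) * speed));
    }
    return out;
}
```

Calling resample on the example data with a speed of 0.5 or 0.75 yields the rows of the corresponding table. Computing each position as n * speed, rather than adding the speed repeatedly, avoids accumulating rounding error, but it only works while the speed stays constant.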

I code this via a "cursor" that is incremented each sample frame, where the increment amount determines the "speed" of the playback. The cursor points into an array, as an integer index would, but it is a float (or double). If the cursor's value has a fractional part, the fraction is used to interpolate between the sample values pointed to by the integer part and by the integer part plus one.

For example, if the cursor were 6.25, the value of soundData[6] were A, and the value of soundData[6+1] were B, the sound value would be:

audioValue = A * 0.75 + B * 0.25
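
A C++ version of such a cursor might look like the sketch below. The Cursor struct and its member names are hypothetical (only soundData is taken from the example above), and the buffer is assumed to hold float samples and to outlive the cursor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical playback cursor: a fractional read position into a sample buffer.
struct Cursor {
    const std::vector<float>& soundData;  // must outlive the cursor
    double position = 0.0;                // fractional index into soundData
    double increment = 1.0;               // playback speed; may change between frames

    explicit Cursor(const std::vector<float>& data) : soundData(data) {
        assert(data.size() >= 2);
    }

    // True once the cursor has moved past the last sample.
    bool finished() const {
        return position > static_cast<double>(soundData.size() - 1);
    }

    // Return the interpolated value at the cursor, then advance the cursor.
    float next() {
        assert(!finished());
        const std::size_t i = static_cast<std::size_t>(position);
        const double frac = position - static_cast<double>(i);
        const double a = soundData[i];
        // At the very last sample there is no neighbour to blend with.
        const double b = (i + 1 < soundData.size()) ? soundData[i + 1] : a;
        position += increment;
        return static_cast<float>(a * (1.0 - frac) + b * frac);
    }
};
```

With position at 6.25, next() returns soundData[6] * 0.75 + soundData[7] * 0.25, matching the formula above. Because increment is read on every call, the speed can be changed from one frame to the next.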

The precision with which you can define the speed increment is quite high. I think Java's floats are considered sufficient for this purpose.

As for keeping a dynamically changing speed increment smooth, I spread out the change to a new speed over a series of 4096 steps (roughly 1/10th of a second at 44100 fps). Change requests are often asynchronous, e.g., from a GUI, and arrive over time in a somewhat unpredictable way. The smoothing algorithm should be able to recalculate and update itself with each new speed request.
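
One way to sketch this smoothing in C++ is a small helper that moves the increment toward the requested speed in equal steps. The SpeedRamp class and its method names are hypothetical, and the linear ramp is an assumption; the answer does not say which curve setSpeed in AudioCue.java actually uses.

```cpp
#include <cassert>

// Hypothetical helper: ramps the cursor increment toward a target speed
// over a fixed number of sample frames (assumed linear ramp).
class SpeedRamp {
public:
    explicit SpeedRamp(int rampFrames = 4096, double initialSpeed = 1.0)
        : rampFrames_(rampFrames), current_(initialSpeed), target_(initialSpeed) {
        assert(rampFrames > 0);
    }

    // May be called at any time, including mid-ramp:
    // the ramp is recalculated starting from the current speed.
    void setTarget(double speed) {
        target_ = speed;
        step_ = (target_ - current_) / rampFrames_;
        remaining_ = rampFrames_;
    }

    // Call once per output frame; returns the increment to use for that frame.
    double nextIncrement() {
        if (remaining_ > 0) {
            --remaining_;
            // Snap to the target on the final step to avoid rounding drift.
            current_ = (remaining_ == 0) ? target_ : current_ + step_;
        }
        return current_;
    }

private:
    int rampFrames_;
    double current_;
    double target_;
    double step_ = 0.0;
    int remaining_ = 0;
};
```

The audio loop would call nextIncrement() once per frame and advance the cursor by the returned amount. This sketch is not thread-safe: if setTarget is called from a GUI thread while another thread produces audio, the new target has to be handed over safely, for example through an atomic variable or a lock.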

The following link demonstrates both strategies, with a sound's playback speed altered in real time via a slider control.

SlidersTest.jar

This is a runnable copy of the jar file that also contains the source code and runs on Java 8. You can also rename the file to SlidersTest.zip and then drill in to view the source code in context.

Links to the source files can also be reached directly in the following two sections of a page I posted for this code, which I recently wrote and made open source: see AudioCue.java and SlidersTest.java.

AudioCue.java is a long file. The relevant parts are in the inner class at the end of the file, class AudioCuePlayer; for the smoothing algorithm, check the setter method setSpeed, which is about three quarters of the way down. Sorry, I don't have line numbers.
