How to pan audio sample data naturally?

Question

I'm developing a Flutter plugin that targets only Android for now. It's a kind of synthesizer: users can load audio files into memory, adjust the pitch (not pitch shifting), and play multiple sounds with minimal delay using the audio library Oboe.

I managed to get PCM data from the audio files that the MediaCodec class supports, and I also managed to handle pitch by manipulating playback, accessing the PCM array manually.

The PCM data is stored as a float array with values ranging from -1.0 to 1.0. I now want to support panning, the way Android's built-in SoundPool class does. I plan to follow how SoundPool handles panning. SoundPool takes two values for the panning effect: left and right. Both are floats and must be in the range 0.0 to 1.0.

For example, if I pass (1.0F, 0.0F), the user hears the sound only in the left ear. (1.0F, 1.0F) is normal (center). Panning wasn't a problem... until I had to handle stereo sounds. I know what to do to pan stereo PCM data, but I don't know how to make the panning sound natural.

If I shift all the sound to the left side, the right channel must be played on the left side. Conversely, if I shift all the sound to the right side, the left channel must be played on the right side. I also noticed that there is something called a panning rule, which means the sound must be a little louder when it's shifted to one side (about +3 dB). I tried to find a way to produce a natural panning effect, but I couldn't find an algorithm or reference for it.

Below is the structure of the float stereo PCM array. I didn't modify the array when decoding the audio files, so it should be the common interleaved layout:

[left_channel_sample_0, right_channel_sample_0, left_channel_sample_1, right_channel_sample_1,
...,
left_channel_sample_n, right_channel_sample_n]

and I have to pass this PCM array to the audio stream as in the C++ code below:

void PlayerQueue::renderStereo(float * audioData, int32_t numFrames) {
    for(int i = 0; i < numFrames; i++) {
        //When audio file is stereo...
        if(player->isStereo) {
            if((offset + i) * 2 + 1 < player->data.size()) {
                audioData[i * 2] += player->data.at((offset + i) * 2);
                audioData[i * 2 + 1] += player->data.at((offset + i) * 2 + 1);
            } else {
                //PCM data reached end
                break;
            }
        } else {
            //When audio file is mono...
            if(offset + i < player->data.size()) {
                audioData[i * 2] += player->data.at(offset + i);
                audioData[i * 2 + 1] += player->data.at(offset + i);
            } else {
                //PCM data reached end
                break;
            }
        }

        //Prevent overflow
        if(audioData[i * 2] > 1.0)
            audioData[i * 2] = 1.0;
        else if(audioData[i * 2] < -1.0)
            audioData[i * 2] = -1.0;

        if(audioData[i * 2 + 1] > 1.0)
            audioData[i * 2 + 1] = 1.0;
        else if(audioData[i * 2 + 1] < -1.0)
            audioData[i * 2 + 1] = -1.0;
    }

    //Add numFrames to offset, so it can continue playing PCM data in next session
    offset += numFrames;

    if(offset >= player->data.size()) {
        offset = 0;
        queueEnded = true;
    }
}

I left out the playback-manipulation calculation to simplify the code. As you can see, I have to manually write the PCM data into the audioData float array. I add the PCM data so that multiple sounds, including several instances of the same sound, get mixed.

How can I perform a panning effect with this PCM array? It would be good to follow SoundPool's mechanism, but anything is fine as long as I can pan properly. (E.g. the pan value can simply be -1.0 to 1.0, with 0 meaning centered.)
When applying the panning rule, what is the relationship between PCM values and decibels? I know how to make a sound louder, but I don't know how to make it louder by an exact number of decibels. Is there a formula for this?

Answer 1

Pan rules, or pan laws, are implemented a bit differently from manufacturer to manufacturer.

One frequently used implementation is that when a sound is panned fully to one side, that side plays at full volume while the other side is fully attenuated. If the sound is played at the center, both sides are attenuated by roughly 3 decibels.

To do this, you can multiply the sound source by the calculated amplitude, e.g. (untested pseudo code):

player->data.at((offset + i) * 2) * 1.0; // left signal at full volume
player->data.at((offset + i) * 2 + 1) * 0.0; // right signal fully attenuated

To get the desired amplitudes you can use the cos function for the left channel and the sin function for the right channel, with an angle between 0 (fully left) and pi/2 (fully right).

Notice that when the input to sin and cos is pi/4, the amplitude is 0.707 on both sides. This gives you the attenuation of around 3 decibels on both sides.

So all that is left to do is to map the range [-1, 1] to the range [0, pi/2], assuming you have a value pan in the range [-1, 1], where -1 is fully left (untested pseudo code):

pan_mapped = ((pan + 1) / 2.0) * (Math.pi / 2.0);

left_amplitude = cos(pan_mapped);
right_amplitude = sin(pan_mapped);
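
The mapping above, together with the decibel question from the original post, can be sketched in C++ roughly as follows. This is a minimal sketch, not a definitive implementation: the names PanGains, constantPowerPan, dbToGain and gainToDb are made up for illustration, and it assumes pan = -1 means fully left (which puts cos on the left channel and sin on the right). The decibel conversion uses the standard amplitude relation gain = 10^(dB / 20).

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper type: left/right gain pair for one source.
struct PanGains {
    float left;
    float right;
};

// Constant-power (sin/cos) pan law.
// Assumption: pan is in [-1, 1], -1 = fully left, 0 = center, +1 = fully right.
inline PanGains constantPowerPan(float pan) {
    constexpr float kHalfPi = 1.57079632679f;
    pan = std::max(-1.0f, std::min(1.0f, pan));         // clamp out-of-range input
    const float angle = (pan + 1.0f) * 0.5f * kHalfPi;  // map [-1, 1] to [0, pi/2]
    return {std::cos(angle), std::sin(angle)};
}

// Convert a level change in decibels to a linear amplitude factor.
inline float dbToGain(float db) {
    return std::pow(10.0f, db / 20.0f);
}

// Convert a linear amplitude factor (must be > 0) to decibels.
inline float gainToDb(float gain) {
    return 20.0f * std::log10(gain);
}
```

For a mono source, the two gains would multiply the sample before it is added to audioData[i * 2] and audioData[i * 2 + 1]. To make a sound louder by an exact amount, multiply every sample by dbToGain(x); for example, +3 dB is a factor of about 1.41, and the center gain of 0.707 corresponds to about -3 dB.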

UPDATE:

Another frequently used option (e.g. in the ProTools DAW) is to have a pan setting for each side, effectively treating the stereo source as two mono sources. This allows you to place the left source freely in the stereo field without affecting the right source.

To do this you would (untested pseudo code, with left_pan and right_pan as angles in [0, pi/2], where 0 is fully left and pi/2 is fully right):

left_output  += left_source(i)  * cos(left_pan)
right_output += left_source(i)  * sin(left_pan)
left_output  += right_source(i) * cos(right_pan)
right_output += right_source(i) * sin(right_pan)

The settings of these two pans are up to the operator and depend on the recording and the desired effect. How you map this to a single pan control is up to you. I would just advise that when the pan is 0 (centered), the left channel plays only on the left side and the right channel plays only on the right side. Otherwise you would interfere with the original stereo recording.

One possibility would be that the segment [-1, 0) controls the right pan, leaving the left side untouched, and vice versa for [0, 1].

import math

hPi = math.pi / 2.0  # half pi: the full sweep of the sin/cos pan law

def stereoPan(x):
    # x is the pan position in [-1, 1]; 0 leaves the stereo image untouched
    if x < 0.0:
        print("left source:")
        print(1.0) # amplitude to left channel
        print(0.0) # amplitude to right channel
        print("right source:")
        print(math.sin(abs(x) * hPi)) # amplitude to left channel
        print(math.cos(abs(x) * hPi)) # amplitude to right channel
    else:
        print("left source:")
        print(math.cos(x * hPi)) # amplitude to left channel
        print(math.sin(x * hPi)) # amplitude to right channel
        print("right source:")
        print(0.0) # amplitude to left channel
        print(1.0) # amplitude to right channel
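
Since the question's render loop is C++, the same single-control mapping could be applied to an interleaved stereo buffer along these lines. This is only a sketch under assumptions: StereoPanGains, stereoPanGains and mixPannedStereo are hypothetical names, plain std::vector buffers stand in for the player data and the Oboe output buffer, and pan is taken to be in [-1, 1] with -1 meaning fully left.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type: gain from each source channel to each output channel.
struct StereoPanGains {
    float leftToLeft, leftToRight;
    float rightToLeft, rightToRight;
};

// Single pan control for a stereo source; pan = 0 leaves the image untouched.
inline StereoPanGains stereoPanGains(float pan) {
    constexpr float kHalfPi = 1.57079632679f;
    pan = std::max(-1.0f, std::min(1.0f, pan));
    const float a = std::fabs(pan) * kHalfPi;
    if (pan < 0.0f) {
        // Left source stays put; right source moves toward the left.
        return {1.0f, 0.0f, std::sin(a), std::cos(a)};
    }
    // Right source stays put; left source moves toward the right.
    return {std::cos(a), std::sin(a), 0.0f, 1.0f};
}

// Mix interleaved stereo `src` [L0, R0, L1, R1, ...] into interleaved `out`.
// Processes only as many whole frames as both buffers hold (a simplification).
inline void mixPannedStereo(const std::vector<float>& src, float pan,
                            std::vector<float>& out) {
    const StereoPanGains g = stereoPanGains(pan);
    const std::size_t frames = std::min(src.size(), out.size()) / 2;
    for (std::size_t i = 0; i < frames; ++i) {
        const float l = src[2 * i];
        const float r = src[2 * i + 1];
        out[2 * i]     += l * g.leftToLeft  + r * g.rightToLeft;
        out[2 * i + 1] += l * g.leftToRight + r * g.rightToRight;
    }
}
```

Note that at full pan one output channel receives the sum of both source channels, so its magnitude can reach 2.0 before any clamping. The clipping guard in the question's loop would still be needed, or the summed channel could be scaled down.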

Answer 2

The following is not meant to contradict anything in the excellent answer given by @ruff09. I'm just going to add some thoughts and theory that I think are relevant when trying to emulate panning.

I'd like to point out that simply using volume differences has a couple of drawbacks. First off, it doesn't match the real-world phenomenon. Imagine you are walking down a sidewalk and right there on the street, on your right, is a worker with a jackhammer. We could make the sound 100% volume on the right and 0% on the left. But in reality much of what we hear from that source also comes in through the left ear, drowning out other sounds.

If you omit left-ear volume for the jackhammer to obtain maximum right pan, then even quiet sounds on the left will be audible (which is absurd), since they will not be competing with jackhammer content on the left track. If you do have left-ear volume for the jackhammer, then the volume-based panning effect will swing the location more towards the center. Dilemma!

How do our ears differentiate locations in such situations? I know of two processes that could be incorporated into the panning algorithm to make the panning more "natural." One is a filtering component. High frequencies whose wavelengths are smaller than the width of our head get attenuated. So you could add some differential low-pass filtering to your sounds. Another aspect is that in our scenario, the jackhammer sounds reach the right ear slightly before they reach the left (by a fraction of a millisecond). Thus, you could also add a bit of delay based on the panning angle. The time-based panning effect works most clearly with frequency content whose wavelengths are larger than our heads (e.g., some high-pass filtering would also be a component).
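
The delay idea could be sketched like this for a mono source. Everything here is an assumption for illustration rather than an established recipe: the names kMaxItdSeconds, farEarDelaySamples and panByDelay are hypothetical, the maximum ear-to-ear delay is taken as roughly 0.65 ms (a commonly quoted ballpark for an adult head), the delay scales linearly with |pan|, and it is rounded to whole samples with no interpolation or filtering.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Assumed maximum interaural time difference in seconds (ballpark figure).
constexpr float kMaxItdSeconds = 0.00065f;

// Delay in whole samples for the ear farther from the source.
// Assumption: pan is in [-1, 1], -1 = fully left, +1 = fully right.
inline std::size_t farEarDelaySamples(float pan, float sampleRate) {
    pan = std::max(-1.0f, std::min(1.0f, pan));
    return static_cast<std::size_t>(
        std::lround(std::fabs(pan) * kMaxItdSeconds * sampleRate));
}

// Render mono input to interleaved stereo, delaying only the far ear.
inline std::vector<float> panByDelay(const std::vector<float>& mono,
                                     float pan, float sampleRate) {
    const std::size_t delay = farEarDelaySamples(pan, sampleRate);
    std::vector<float> out((mono.size() + delay) * 2, 0.0f);
    // Source on the right: the left ear hears it later, and vice versa.
    const std::size_t leftDelay  = pan > 0.0f ? delay : 0;
    const std::size_t rightDelay = pan < 0.0f ? delay : 0;
    for (std::size_t i = 0; i < mono.size(); ++i) {
        out[(i + leftDelay) * 2]      = mono[i];
        out[(i + rightDelay) * 2 + 1] = mono[i];
    }
    return out;
}
```

At 48 kHz the full-pan delay comes out to about 31 samples. In practice this would likely be combined with some amplitude panning, and changing the delay while a sound is playing needs interpolation to avoid clicks, which is the tricky part mentioned in this answer.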

There has also been a great deal of work on how the shapes of our ears apply differential filtering to sounds. I think we learn to use this as we grow up by subconsciously associating different timbres with different locations (this especially pertains to altitude and front vs. back stereo issues).

There are big computation costs, though. So simplifications such as sticking with purely amplitude-based panning are the norm. Thus, for sounds in a 3D world, it is probably best to prefer mono source content for items that need dynamic location changes, and to use stereo content only for background music or ambient content that doesn't need dynamic panning based on player location.

I want to do some more experimenting with dynamic time-based panning combined with a bit of amplitude, to see if this can be used effectively with stereo cues. Implementing a dynamic delay is a little tricky, but not as costly as filtering. I'm wondering if there might be ways to record a sound source (preprocess it) to make it more amenable to real-time filter- and time-based manipulation that results in effective panning.


Answer 1: score 2, accepted, 2021-04-12 19:20:42

Answer 2: score 0, 2021-04-13 20:10:36
