Convert multi-channel PyAudio into NumPy array

All the examples I can find are mono, with CHANNELS = 1. How do you read stereo or multichannel input using the callback method in PyAudio and convert it into a 2D NumPy array or multiple 1D arrays?

For mono input, something like this works:

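import numpy as np
import pyaudio

def callback(in_data, frame_count, time_info, status):
    global result
    global result_waiting

    if in_data:
        # np.fromstring is deprecated for binary data; np.frombuffer reads the raw bytes
        result = np.frombuffer(in_data, dtype=np.float32)
        result_waiting = True
    else:
        print('no input')

    return None, pyaudio.paContinue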

# fs is the sample rate in Hz, defined elsewhere
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paFloat32,
                channels=1,
                rate=fs,
                output=False,
                input=True,
                frames_per_buffer=fs,
                stream_callback=callback)

But this does not work for stereo input: the result array is twice as long, so I assume the channels are interleaved or something similar, but I can't find documentation for this.

It appears to be interleaved sample-by-sample, with the left channel first. With a signal on the left channel input and silence on the right channel, I get:

result = [0.2776, -0.0002,  0.2732, -0.0002,  0.2688, -0.0001,  0.2643, -0.0003,  0.2599, ...

So to separate it out into a stereo stream, reshape it into a 2D array:

result = np.frombuffer(in_data, dtype=np.float32)
result = np.reshape(result, (frames_per_buffer, 2))

Now to access the left channel, use result[:, 0], and for the right channel, use result[:, 1].
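
The reshape step can be tried without an audio device. The following is only a minimal sketch: the helper name `split_channels` and the synthetic buffer are assumptions made for illustration, standing in for the `in_data` bytes that a PyAudio callback would receive. Passing -1 to the reshape lets NumPy infer the number of frames from the buffer length.

```python
import numpy as np

def split_channels(in_data, channels=2, dtype=np.float32):
    """Hypothetical helper: interleaved bytes -> array of shape (frames, channels)."""
    samples = np.frombuffer(in_data, dtype=dtype)
    # -1 lets NumPy infer the frame count; reshape raises ValueError
    # if the sample count is not a multiple of the channel count.
    return samples.reshape(-1, channels)

# Synthetic stand-in for the bytes delivered to the callback:
# a signal on the left channel and silence on the right.
left = np.array([0.25, 0.5, 0.75], dtype=np.float32)
right = np.zeros(3, dtype=np.float32)
in_data = np.column_stack((left, right)).tobytes()  # L0, R0, L1, R1, ...

frames = split_channels(in_data)
print(frames.shape)   # (3, 2)
print(frames[:, 0])   # left channel
print(frames[:, 1])   # right channel
```

Note that `np.frombuffer` returns a read-only view of the bytes, so call `.copy()` on the result if the samples need to be modified in place.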

def decode(in_data, channels):
    """
    Convert a byte stream into a 2D numpy array with 
    shape (chunk_size, channels)

    Samples are interleaved, so for a stereo stream with left channel 
    of [L0, L1, L2, ...] and right channel of [R0, R1, R2, ...], the output 
    is ordered as [L0, R0, L1, R1, ...]
    """
    # TODO: handle data type as parameter, convert between pyaudio/numpy types
    result = np.frombuffer(in_data, dtype=np.float32)

    # np.reshape needs an integer size, and "/" yields a float in Python 3
    chunk_length, remainder = divmod(len(result), channels)
    assert remainder == 0

    result = np.reshape(result, (chunk_length, channels))
    return result


def encode(signal):
    """
    Convert a 2D numpy array into a byte stream for PyAudio

    Signal should be a numpy array with shape (chunk_size, channels)
    """
    interleaved = signal.flatten()

    # TODO: handle data type as parameter, convert between pyaudio/numpy types
    out_data = interleaved.astype(np.float32).tobytes()
    return out_data
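
One way to sanity-check a decode/encode pair like this is a round trip on synthetic data. The sketch below is an illustration, not the code above: `decode_sketch` and `encode_sketch` are hypothetical names, and the `dtype` parameter is one possible way to approach the TODO. It assumes the stream format matches the NumPy dtype, for example `pyaudio.paInt16` with `np.int16`.

```python
import numpy as np

def decode_sketch(in_data, channels, dtype=np.float32):
    """Interleaved bytes -> array of shape (frames, channels)."""
    samples = np.frombuffer(in_data, dtype=dtype)
    frames, remainder = divmod(samples.size, channels)
    if remainder:
        raise ValueError("buffer length is not a multiple of the channel count")
    return samples.reshape(frames, channels)

def encode_sketch(signal, dtype=np.float32):
    """Array of shape (frames, channels) -> interleaved bytes."""
    # Row-major (C order) bytes place the channels of each frame side by side.
    return np.ascontiguousarray(signal, dtype=dtype).tobytes()

# Four frames of three-channel 16-bit audio, made up for the test
signal = np.arange(12, dtype=np.int16).reshape(4, 3)
raw = encode_sketch(signal, dtype=np.int16)
restored = decode_sketch(raw, channels=3, dtype=np.int16)

print(len(raw))                          # 24 (4 frames * 3 channels * 2 bytes)
print(np.array_equal(signal, restored))  # True
```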

