Real-time streaming from mic to speaker in PyAudio

Original: 2022-08-20 23:50:59 | python-3.x, pyaudio

I'd like to stream audio in real time from mic to speaker using PyAudio, with an opportunity to read / modify / write the sample buffers as they go by.

What is the idiomatically correct way to do this in PyAudio?

I understand that in callback mode, the output stream driving the speaker wants to "pull" samples in its callback function. Similarly, the input stream consuming samples from the microphone wants to "push" samples in its callback function. I also understand that callbacks run in their own threads, and that the docs say:

    "Do not call Stream.read() or Stream.write() if using non-blocking operation."

Given those constraints, it's not clear how to connect a microphone's stream to a speaker's stream. (And I understand the complexities if the microphone and speaker clocks are not synchronized.)

Assuming that the microphone and speaker clocks ARE synchronized, how would you stream from mic to speaker?
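For reference, one arrangement that fits the synchronized-clock assumption is a single full-duplex stream opened with both input=True and output=True, so that one callback receives the mic buffer and returns the speaker buffer. The following is only a minimal sketch under that assumption: `modify`, `wire_callback`, `run_wire`, the sample rate, and the channel count are illustrative names and values, and whether the default devices can be opened as one duplex stream depends on the hardware.

```python
PA_CONTINUE = 0  # stands in for pyaudio.paContinue so this sketch loads without PyAudio


def modify(in_data: bytes) -> bytes:
    """Hypothetical hook for read/modify/write; here it passes audio through unchanged."""
    return in_data


def wire_callback(in_data, frame_count, time_info, status):
    # PyAudio calls this on its own thread: in_data holds the captured bytes,
    # and the returned bytes are what gets played.
    return (modify(in_data), PA_CONTINUE)


def run_wire(seconds=5.0, rate=44100, channels=1):
    """Open one duplex stream and let the callback run for `seconds` (needs audio hardware)."""
    import time

    import pyaudio  # imported here so the functions above can be tried without it

    pa = pyaudio.PyAudio()
    stream = pa.open(
        format=pyaudio.paInt16,
        channels=channels,
        rate=rate,
        input=True,
        output=True,
        stream_callback=wire_callback,
    )
    try:
        time.sleep(seconds)  # the callback does all the work in the background
    finally:
        stream.stop_stream()
        stream.close()
        pa.terminate()
```

Calling `run_wire()` should loop the microphone to the speaker for a few seconds; Stream.read() and Stream.write() are never called, which is consistent with the docs' warning about non-blocking operation.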

1 Answer

[As soon as I hit [send], the following occurred to me. Let me know if it's the right idiom for PyAudio...]

You could allocate three frames ahead of time and cycle through them: one gets passed to the microphone callback, one is available for read/modify/write processing, and the third gets passed to the speaker callback. (You might actually need four frames to allow for latency and scheduling delays.)
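A rough sketch of that hand-off, with two separate callback-mode streams, might look like the following. Note that it swaps in a bounded queue.Queue for the rotation of preallocated frames (the queue takes care of the locking between the two callback threads), and it assumes both streams use the same format and frames_per_buffer so that each queued buffer has the length the speaker callback expects. All names and constants here (`buffers`, `process`, `mic_callback`, `speaker_callback`, `open_streams`, `RATE`, and so on) are made up for illustration.

```python
import queue

RATE = 44100
CHANNELS = 1
SAMPLE_WIDTH = 2          # bytes per sample for 16-bit audio
FRAMES_PER_BUFFER = 1024
PA_CONTINUE = 0           # stands in for pyaudio.paContinue so this sketch loads without PyAudio

# At most four buffers in flight, mirroring the "three or four frames" idea.
buffers = queue.Queue(maxsize=4)


def process(data: bytes) -> bytes:
    """Hypothetical read/modify/write step; here a pass-through."""
    return data


def mic_callback(in_data, frame_count, time_info, status):
    # The input stream "pushes": hand the captured buffer to the queue.
    try:
        buffers.put_nowait(process(in_data))
    except queue.Full:
        pass  # speaker side fell behind; drop this buffer rather than block
    return (None, PA_CONTINUE)  # out_data is unused for an input-only stream


def speaker_callback(in_data, frame_count, time_info, status):
    # The output stream "pulls": take the next buffer, or play silence on underrun.
    try:
        out = buffers.get_nowait()
    except queue.Empty:
        out = b"\x00" * (frame_count * CHANNELS * SAMPLE_WIDTH)
    return (out, PA_CONTINUE)


def open_streams():
    """Open the two callback-mode streams (needs PyAudio and audio hardware)."""
    import pyaudio

    pa = pyaudio.PyAudio()
    common = dict(format=pyaudio.paInt16, channels=CHANNELS, rate=RATE,
                  frames_per_buffer=FRAMES_PER_BUFFER)
    mic = pa.open(input=True, stream_callback=mic_callback, **common)
    speaker = pa.open(output=True, stream_callback=speaker_callback, **common)
    return pa, mic, speaker
```

Dropping on overflow and emitting silence on underrun is one possible policy; it keeps both callbacks non-blocking at the cost of an occasional glitch if the two sides drift apart.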

I didn't see any documentation on the format of the frames themselves; are they just arrays of ints (for 16-bit sample data)? [UPDATE: "out_data is a byte array whose length should be the (frame_count * channels * bytes-per-channel)"] So that's easy...
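Given that byte layout, the modify step could be sketched with the standard-library array module, assuming 16-bit samples (paInt16) in the machine's native byte order. The function name `apply_gain` and the gain example are illustrative only; any per-sample processing would follow the same unpack / modify / repack pattern.

```python
from array import array

SAMPLE_WIDTH = 2  # bytes per sample for 16-bit audio


def apply_gain(in_data: bytes, gain: float) -> bytes:
    """Scale 16-bit samples by `gain`, clipping to the int16 range."""
    samples = array("h")  # "h" = signed short
    assert samples.itemsize == SAMPLE_WIDTH, "this sketch assumes a 2-byte short"
    samples.frombytes(in_data)  # interleaved samples: frame_count * channels values
    for i, s in enumerate(samples):
        samples[i] = max(-32768, min(32767, int(s * gain)))
    return samples.tobytes()  # same length as in_data
```

The returned bytes have the same length as the input, so the result still satisfies the frame_count * channels * bytes-per-channel requirement quoted above.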