
Analyze currently playing audio with Python

Original post 2012-03-29 19:34:27 5 1 python/ audio/ directshow

I'd like to build a small Python program that can listen to and analyze the audio currently playing on a computer, for example from any media player.

I know that this is possible with DirectShow on Windows, but I'm not sure how to use it from Python. Ideally, though, I'd like a cross-platform approach that does not use DirectX.

1 solution

In general, to "listen to something" from your sound card you will have to use an audio toolkit or module, and you will usually end up setting up a record-process-play routine (you can omit the play step, of course).

If your application is not a hard real-time one (i.e. you can afford to miss a few samples from the input), you could start with PyAudio's "Record a few seconds of audio and save it to a file" example from their website.
So in your case, you would:

1. Record a buffer
2. Process it
3. If the terminating condition is not satisfied, go back to (1).
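
The three steps above can be sketched roughly as follows. This is a minimal illustration, not PyAudio's own example: `record_process_loop`, `peak_amplitude` and `open_pyaudio_reader` are hypothetical names, and the format (16-bit mono, 44100 Hz, 1024 frames per buffer) is an assumption. The default input device is usually a microphone; routing the currently playing audio to an input (a loopback or "stereo mix" device) is system-specific and assumed to be set up already.

```python
from array import array


def record_process_loop(read_buffer, process, should_stop):
    """Record a buffer, process it, and repeat until should_stop(result) is true."""
    results = []
    while True:
        buf = read_buffer()        # 1. record a buffer (blocks until it is full)
        result = process(buf)      # 2. process it; nothing is recorded meanwhile
        results.append(result)
        if should_stop(result):    # 3. terminating condition
            return results


def peak_amplitude(buf):
    """Example processing step: largest absolute sample in a 16-bit buffer."""
    samples = array("h", buf)      # interpret raw bytes as signed 16-bit samples
    return max((abs(s) for s in samples), default=0)


def open_pyaudio_reader(rate=44100, frames_per_buffer=1024):
    """Return (read_buffer, close) backed by a blocking PyAudio input stream."""
    import pyaudio  # imported lazily so the loop above can be tried without it

    pa = pyaudio.PyAudio()
    stream = pa.open(format=pyaudio.paInt16, channels=1, rate=rate,
                     input=True, frames_per_buffer=frames_per_buffer)

    def read_buffer():
        return stream.read(frames_per_buffer)  # raw bytes, 16-bit mono

    def close():
        stream.stop_stream()
        stream.close()
        pa.terminate()

    return read_buffer, close
```

With a sound card available, one could call `read_buffer, close = open_pyaudio_reader()`, then `record_process_loop(read_buffer, peak_amplitude, lambda peak: peak > 20000)`, and finally `close()`. The threshold here is arbitrary.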

But in this case, as you may have noticed, you would be missing samples from the input while you do the processing, because during that time you are not recording anything.
This is especially true for PyAudio because, at the time of writing, it only supports blocking mode. Depending on your application, you could get away with that, but if you want real-time(ish) operation you would have to use threads.

If your real-time requirements are stricter (i.e. you can't afford to lose even a few samples from your input), you would still use the "record-process-[play]" routine, but this time you would run the recording in a thread and have it communicate with your main process through a FIFO queue (First In, First Out), for example a Deque, so that buffers are processed in the order they were recorded.

It would go something like this:

Recording Thread:

1. Record a buffer
2. Push the data on the Deque
3. Repeat from (1)

Main Process:

1. If the Deque has buffers, then:
   1. Pull a buffer from the Deque
   2. Process it
2. Repeat from (1)

In this way, your processing can go on at its own pace while the recording thread keeps filling up buffers and pushing them on the Deque.
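
A minimal sketch of the two loops, using only the standard library. The names `recording_thread` and `run`, the `None` end-of-input marker and the polling interval are all assumptions made for illustration; `read_buffer` stands in for whatever blocking read your audio library provides. Buffers are appended on the right and taken from the left, so they are processed in the order they were recorded.

```python
import threading
import time
from collections import deque


def recording_thread(read_buffer, buffers, stop_event):
    """Producer: keep recording buffers and pushing them on the deque."""
    while not stop_event.is_set():
        buf = read_buffer()          # blocking read of one buffer
        if buf is None:              # hypothetical end-of-input marker
            break
        buffers.append(buf)          # push on the right end


def run(read_buffer, process, poll_interval=0.001):
    """Consumer: pull buffers from the deque and process them at our own pace."""
    buffers = deque()
    stop_event = threading.Event()
    worker = threading.Thread(target=recording_thread,
                              args=(read_buffer, buffers, stop_event))
    worker.start()
    results = []
    try:
        # Keep going while the recorder is running or buffers are still queued.
        while worker.is_alive() or buffers:
            try:
                buf = buffers.popleft()    # oldest buffer first
            except IndexError:
                time.sleep(poll_interval)  # nothing recorded yet, wait briefly
                continue
            results.append(process(buf))
    finally:
        stop_event.set()
        worker.join()
    return results
```

For example, with a fake source, `src = iter([[1, 2], [3, 4]])` and `run(lambda: next(src, None), sum)` processes each list as one "buffer". If the deque should not grow without bound when processing falls behind, `deque(maxlen=...)` is one option, at the cost of silently discarding the oldest buffers.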

The good news in the case of Python is that the Deque is thread-safe (its appends and pops from either end), so you will not have any synchronization problems when your main process and the thread access the Deque simultaneously.

Again, depending on your application, you might also need to move to faster hardware, such as hardware based on the ASIO protocol.

Finally, you will also need to modify your processing algorithms a little to take into account that you are now working with a sequence of frames instead of one buffer. To keep things smooth, you would have to save the state of your operations from one frame to the next. For more information, see the "overlap-add" method.
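
As an illustration of carrying state from one frame to the next, here is a small pure-Python sketch of overlap-add for an FIR filter. The class name `OverlapAddFilter` and the direct (non-FFT) convolution are choices made for this example only; a real application would more likely use NumPy or an FFT-based convolution.

```python
def convolve(x, h):
    """Plain linear convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


class OverlapAddFilter:
    """FIR filter applied frame by frame, carrying the tail between frames."""

    def __init__(self, h):
        self.h = list(h)
        self.tail = [0.0] * (len(self.h) - 1)  # state saved across frames

    def process(self, frame):
        y = convolve(frame, self.h)            # len(frame) + len(h) - 1 samples
        for i, t in enumerate(self.tail):      # add the overlap left by the
            y[i] += t                          # previous frame
        n = len(frame)
        self.tail = y[n:]                      # keep the part that spills over
        return y[:n]                           # emit exactly one frame of output
```

Feeding a signal through `process` in pieces gives the same samples as convolving the whole signal at once, which is the property that keeps the output smooth across frame boundaries.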

All the best