How to read real-time microphone audio volume in Python with ffmpeg or similar

Question

I'm trying to read, in near real time, the volume of the audio coming from a USB microphone in Python.

I have the pieces, but can't figure out how to put them together.

If I already have a .wav file, I can read it quite simply using wavefile:

import numpy as np
from wavefile import WaveReader

with WaveReader("/Users/rmartin/audio.wav") as r:
    for data in r.read_iter(size=512):
        left_channel = data[0]
        # L2 norm of the block, used as a rough volume measure
        volume = np.linalg.norm(left_channel)
        print(volume)

This works great, but I want to process the audio from the microphone in real time, not from a file.

So my thought was to use something like ffmpeg to PIPE the real-time output into WaveReader, but my knowledge of byte handling is somewhat lacking.

import subprocess
import numpy as np

command = ["/usr/local/bin/ffmpeg",
            '-f', 'avfoundation',
            '-i', ':2',
            '-t', '5',
            '-ar', '11025',
            '-ac', '1',
            '-acodec','aac', '-']

pipe = subprocess.Popen(command, stdout=subprocess.PIPE, bufsize=10**8)
stdout_data = pipe.stdout.read()
audio_array = np.frombuffer(stdout_data, dtype="int16")

print(audio_array)

That looks pretty, but it doesn't do much. It fails with this error: [NULL @ 0x7ff640016600] Unable to find a suitable output format for 'pipe:'

I assume this is a fairly simple thing to do, given that I only need to check the audio for volume levels.

Does anyone know how to accomplish this simply? FFMPEG isn't a requirement, but it does need to work on OSX & Linux.

Answer 1

Thanks to @Matthias for the suggestion to use the sounddevice module. It's exactly what I need.

For posterity, here is a working example that prints real-time audio levels to the shell:

# Print out realtime audio volume as ascii bars

import sounddevice as sd
import numpy as np

def print_sound(indata, outdata, frames, time, status):
    # Scale the L2 norm of the input block so quiet sounds still draw a bar
    volume_norm = np.linalg.norm(indata) * 10
    print("|" * int(volume_norm))

with sd.Stream(callback=print_sound):
    sd.sleep(10000)
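
A note on the number printed above: np.linalg.norm(indata) is the L2 norm of the whole block, so it grows with the block length as well as with loudness. Below is a minimal sketch of a measure that does not depend on block size (RMS, optionally expressed in dB relative to full scale). It uses only the standard library so it can be tried without a microphone. The names rms and dbfs are illustrative and not part of sounddevice, and the sketch assumes float samples in the range -1.0 to 1.0, which is sounddevice's default float32 format.

```python
import math

def rms(samples):
    """Root-mean-square level of an iterable of float samples."""
    samples = list(samples)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def dbfs(level, floor=-120.0):
    """Convert a linear level (1.0 = full scale) to dB, clamped at `floor`."""
    if level <= 0.0:
        return floor
    return max(floor, 20.0 * math.log10(level))

# A full-scale square wave has the same RMS whatever the block length,
# whereas its L2 norm would grow with the number of samples.
short_block = [1.0, -1.0] * 8
long_block = [1.0, -1.0] * 512
print(rms(short_block), rms(long_block))
print(dbfs(rms([0.5, -0.5, 0.5, -0.5])))
```

Inside the callback, the same RMS value can be computed with np.sqrt(np.mean(indata**2)).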

Answer 2

Python 3 user here.
I had a few problems getting that to work, so I started from this example: https://python-sounddevice.readthedocs.io/en/0.3.3/examples.html#plot-microphone-signal-s-in-real-time
I also needed to install python3-tk for Python 3.6 (sudo apt-get install python3-tk); see "Tkinter module not found on Ubuntu".
Then I modified the script:

#!/usr/bin/env python3
import numpy as np
import sounddevice as sd

duration = 10  # in seconds

def audio_callback(indata, frames, time, status):
    volume_norm = np.linalg.norm(indata) * 10
    print("|" * int(volume_norm))

stream = sd.InputStream(callback=audio_callback)
with stream:
    sd.sleep(duration * 1000)

And yes, it works :)
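
For readers who still want the ffmpeg route from the question, it can probably be made to work too. The "Unable to find a suitable output format" error arises because ffmpeg cannot guess a container from the output name "-"; naming a format explicitly with -f avoids that. Raw PCM (-f s16le with -acodec pcm_s16le) is the easiest to parse, whereas AAC output is compressed and cannot be read directly as int16 samples. The following is a minimal sketch, assuming ffmpeg is on the PATH and reusing the avfoundation device ':2' from the question (on Linux, an input format such as alsa or pulse with a suitable device name would replace it). The helper names ffmpeg_command, chunk_volume, volumes and main are illustrative.

```python
import array
import math
import subprocess
import sys

SAMPLE_RATE = 11025
BYTES_PER_SAMPLE = 2  # s16le: signed 16-bit little-endian, mono

def ffmpeg_command(device=":2", input_format="avfoundation"):
    """Build an ffmpeg command that writes raw 16-bit mono PCM to stdout."""
    return ["ffmpeg", "-loglevel", "error",
            "-f", input_format, "-i", device,
            "-ar", str(SAMPLE_RATE), "-ac", "1",
            "-acodec", "pcm_s16le", "-f", "s16le", "-"]

def chunk_volume(raw):
    """RMS of a chunk of s16le bytes, scaled to the range 0.0..1.0."""
    samples = array.array("h")
    # Drop a trailing odd byte so only whole samples are decoded
    samples.frombytes(raw[:len(raw) - len(raw) % BYTES_PER_SAMPLE])
    if sys.byteorder == "big":
        samples.byteswap()  # the stream is little-endian
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples)) / 32768.0

def volumes(stream, frames=512):
    """Yield one volume per chunk read from a binary file-like object."""
    while True:
        raw = stream.read(frames * BYTES_PER_SAMPLE)
        if not raw:
            break
        yield chunk_volume(raw)

def main():
    """Capture from the microphone and print ASCII bars until ffmpeg exits."""
    proc = subprocess.Popen(ffmpeg_command(), stdout=subprocess.PIPE)
    try:
        for volume in volumes(proc.stdout):
            print("|" * int(volume * 50))
    finally:
        proc.terminate()
```

Calling main() starts the capture; Ctrl+C stops it. Reading fixed-size chunks from the pipe, rather than calling read() once, is what lets the levels appear while ffmpeg is still recording.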

2 answers

Answer 1: 30, accepted, 2016-10-20 14:28:53

Answer 2: 7, 2018-02-12 13:42:52
