Real-time audio signal processing using Python
I have been trying to do real-time audio signal processing using the PyAudio module in Python. As a simple test case, I read audio data from the microphone and play it back through headphones. I tried the following code (both Python and Cython versions). It works, but unfortunately the playback stalls and is not smooth enough. How can I improve the code so that it runs smoothly? My PC has an i7 CPU and 8 GB of RAM.
Python version
import pyaudio
import numpy as np

RATE = 16000
CHUNK = 256

p = pyaudio.PyAudio()
player = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, output=True,
                frames_per_buffer=CHUNK)
stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, input=True, frames_per_buffer=CHUNK)

for i in range(int(20*RATE/CHUNK)):  # do this for 20 seconds
    player.write(np.frombuffer(stream.read(CHUNK), dtype=np.int16))

stream.stop_stream()
stream.close()
p.terminate()
Cython version
import pyaudio
import numpy as np

cdef int RATE = 16000
cdef int CHUNK = 1024
cdef int i

p = pyaudio.PyAudio()
player = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, output=True, frames_per_buffer=CHUNK)
stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, input=True, frames_per_buffer=CHUNK)

for i in range(500):  # 500 chunks of 1024 frames at 16 kHz, about 32 seconds
    player.write(np.frombuffer(stream.read(CHUNK), dtype=np.int16))

stream.stop_stream()
stream.close()
p.terminate()
I believe you are missing CHUNK as the second argument to the player.write call:
player.write(np.frombuffer(stream.read(CHUNK), dtype=np.int16), CHUNK)
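Why the second argument matters can be sketched without any audio hardware. As far as I can tell from PyAudio's source, Stream.write infers the frame count as len(frames) / (channels * sample_width) when it is not given, which assumes frames is a bytes object. The helper inferred_num_frames below is a hypothetical stand-in for that arithmetic, not a PyAudio function, and array.array stands in for the NumPy int16 array.

```python
from array import array


def inferred_num_frames(frames, channels, sample_width):
    """Hypothetical stand-in for the frame count inferred when it is omitted."""
    return len(frames) // (channels * sample_width)


CHUNK = 256
SAMPLE_WIDTH = 2  # bytes per paInt16 sample

raw = bytes(CHUNK * SAMPLE_WIDTH)  # same size as a mono int16 read of CHUNK frames
samples = array("h", raw)          # 16-bit samples; len() counts samples, not bytes

frames_from_bytes = inferred_num_frames(raw, 1, SAMPLE_WIDTH)
frames_from_array = inferred_num_frames(samples, 1, SAMPLE_WIDTH)
print(frames_from_bytes, frames_from_array)
```

Under that assumption, only half of each chunk would be written when the array is passed without an explicit count, which would be consistent with choppy playback. Passing CHUNK explicitly, or passing the raw bytes, avoids the ambiguity.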
Also, I am not sure whether this is just a formatting error, but player.write needs to be indented inside the for loop.
Also, the PyAudio site computes the number of chunks as RATE / CHUNK * RECORD_SECONDS rather than RECORD_SECONDS * RATE / CHUNK. In Python, * and / have the same precedence and are evaluated left to right, so the second form multiplies before it divides. With true division both forms give the same count; with integer division (Python 2's / on ints, or //) the results can differ.
for i in range(int(20*RATE/CHUNK)):  # do this for 20 seconds
    player.write(np.frombuffer(stream.read(CHUNK), dtype=np.int16), CHUNK)

stream.stop_stream()
stream.close()
p.terminate()
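The effect of operand order on the chunk count can be checked directly. This is a small sketch using the question's RATE and CHUNK; RECORD_SECONDS = 20 is an assumption taken from the factor in the loop above.

```python
RATE = 16000
CHUNK = 256
RECORD_SECONDS = 20  # assumed from the 20 in the loop above

# * and / share one precedence level and group left to right.
multiply_first = RECORD_SECONDS * RATE / CHUNK  # (20 * 16000) / 256
divide_first = RATE / CHUNK * RECORD_SECONDS    # (16000 / 256) * 20

# Floor division truncates the intermediate quotient, so the order matters.
floor_multiply_first = RECORD_SECONDS * RATE // CHUNK
floor_divide_first = RATE // CHUNK * RECORD_SECONDS

print(int(multiply_first), int(divide_first))
print(floor_multiply_first, floor_divide_first)
```

With true division the two orders agree for these values. The floor-division variants differ because 16000 // 256 drops the fractional half chunk before the multiplication.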
Finally, you may want to increase the rate to 44100, CHUNK to 1024, and the number of channels to 2 for better fidelity.
The code below takes the default input device and plays what is recorded on the default output device.
import time

import numpy as np
import pyaudio

p = pyaudio.PyAudio()

CHANNELS = 2
RATE = 44100

def callback(in_data, frame_count, time_info, flag):
    # Convert to a NumPy array here if you want to process the audio:
    # audio_data = np.frombuffer(in_data, dtype=np.float32)
    return in_data, pyaudio.paContinue

stream = p.open(format=pyaudio.paFloat32,
                channels=CHANNELS,
                rate=RATE,
                output=True,
                input=True,
                stream_callback=callback)

stream.start_stream()

# The callback runs on a separate thread; keep the main thread alive for 20 seconds.
time.sleep(20)

stream.stop_stream()
print("Stream is stopped")

stream.close()
p.terminate()
This will run for 20 seconds and stop. The callback function is where you can process the signal:
audio_data = np.frombuffer(in_data, dtype=np.float32)
The return in_data statement is where you send the post-processed data back to the output device.
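As an illustration of processing inside the callback, here is a minimal sketch that scales the samples before they are sent back. The apply_gain helper and the GAIN value are hypothetical names introduced for this example; they are not part of PyAudio or of the answer's code.

```python
import numpy as np

GAIN = 0.5  # hypothetical gain factor chosen for illustration


def apply_gain(in_data, gain=GAIN):
    """Scale float32 samples by `gain` and clip them to [-1.0, 1.0]."""
    samples = np.frombuffer(in_data, dtype=np.float32)
    processed = np.clip(samples * gain, -1.0, 1.0).astype(np.float32)
    return processed.tobytes()


# Inside the stream callback this would replace the plain pass-through:
#     return apply_gain(in_data), pyaudio.paContinue

# Quick check on three hand-made samples, no audio device needed.
demo_in = np.array([0.5, -1.0, 4.0], dtype=np.float32).tobytes()
demo_out = np.frombuffer(apply_gain(demo_in), dtype=np.float32)
print(demo_out)
```

Returning bytes of the same length and sample format as in_data keeps the output consistent with what the stream was opened with. Heavier processing in the callback may reintroduce stalls if it takes longer than one buffer's duration.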
Note that chunk has a default value of 1024, as noted in the PyAudio docs: http://people.csail.mit.edu/hubert/pyaudio/docs/#pyaudio.PyAudio.open
I am working on a similar project. I modified your code and the stalls are now gone. The bigger the chunk, the bigger the delay, which is why I kept it small.
import pyaudio
import numpy as np

CHUNK = 2**5
RATE = 44100
LEN = 10

p = pyaudio.PyAudio()

stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, input=True, frames_per_buffer=CHUNK)
player = p.open(format=pyaudio.paInt16, channels=1, rate=RATE, output=True, frames_per_buffer=CHUNK)

for i in range(int(LEN*RATE/CHUNK)):  # go for LEN seconds
    data = np.frombuffer(stream.read(CHUNK), dtype=np.int16)
    player.write(data, CHUNK)

stream.stop_stream()
stream.close()
player.stop_stream()
player.close()
p.terminate()
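To put rough numbers on the chunk-size trade-off, the duration of one chunk is CHUNK / RATE seconds. The sketch below computes that for the values used in this thread; chunk_duration_ms is a hypothetical helper name, and the figure covers only one buffer's worth of audio, not driver or device buffering.

```python
def chunk_duration_ms(chunk, rate):
    """Duration of `chunk` frames at `rate` frames per second, in milliseconds."""
    return 1000.0 * chunk / rate


# (CHUNK, RATE) pairs that appear in the question and answers.
settings = [(2**5, 44100), (256, 16000), (1024, 16000), (1024, 44100)]

for chunk, rate in settings:
    duration = chunk_duration_ms(chunk, rate)
    print(f"CHUNK={chunk:5d} RATE={rate:6d} -> {duration:6.2f} ms per chunk")
```

Smaller chunks lower this per-buffer delay but leave less time per loop iteration to read, process, and write, so a very small CHUNK is only an improvement if the loop keeps up.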