
Head related impulse response for binaural audio

I am working with audio digital signal processing and binaural audio processing. I am still learning the basics. Right now, the idea is to do deconvolution and get an impulse response.

Please see the attached screenshot.

Detailed description of what is happening:

Here, an exponential sweep signal is taken and played back through a loudspeaker. The playback is recorded using a microphone. The recorded signal is extended using zero padding (probably to double the original length), and the original exponential sweep signal is extended as well. FFTs are taken of both (the extended recording and the extended original), the two FFTs are divided, and we get the room transfer function. Finally, the inverse FFT is taken and some windowing is performed to get the impulse response.

My question:

I am having difficulty implementing this diagram in Python. How would you divide two FFTs? Is it possible? I can probably do all the steps like zero padding and the FFTs, but I guess I am not going about it the correct way. I do not understand the windowing and the "discard second half" option.

Could anyone with knowledge of this show me how I would implement it in Python with a sweep signal? Even a small example with a few plots would help me get the idea. Please help.

Source of this image: http://www.four-audio.com/data/MF/aes-swp-english.pdf

Thanks in advance, Sanket Jain

This is a little over my head, but maybe the following bits of advice can help.

First, I came across a lot of very helpful sample code in Steve Smith's book The Scientist and Engineer's Guide to Digital Signal Processing. It covers a range of operations, from the basics of convolution to the FFT algorithm itself. The sample code is in BASIC, not Python, but the BASIC is perfectly readable and should be easy to translate.

I'm not entirely sure about the specific calculation you describe, but many operations in this realm (when dealing with multiple signals) turn out to simply employ addition or subtraction of constituent elements. To get an authoritative answer, I think you will have better luck at Stack Exchange's Signal Processing site or at one of the forums at DSP Related.

If you do get an answer elsewhere, it might be good to either recap it here or delete this question entirely to reduce clutter.

Yes, dividing two FFT spectra is possible and actually quite easy to implement in Python (but with some caveats). Simply put: convolving two time signals corresponds to multiplying their spectra, so conversely the deconvolution can be realized by dividing the spectra.
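As a quick numerical check of that correspondence, here is a minimal sketch with made-up random signals (the names `a`, `b` and the lengths are purely illustrative assumptions, not part of the measurement itself). It only works this cleanly because the data is noise-free and the FFT length covers the full linear convolution.

```python
import numpy as np
from numpy.fft import rfft, irfft

rng = np.random.default_rng(0)
a = rng.standard_normal(50)   # stands in for the excitation signal
b = rng.standard_normal(20)   # stands in for the impulse response

# FFT length must cover the full linear convolution to avoid wrap-around
n_full = len(a) + len(b) - 1

direct = np.convolve(a, b)                                   # time-domain convolution
via_fft = irfft(rfft(a, n_full) * rfft(b, n_full), n_full)   # multiply spectra

# deconvolution: divide the spectrum of the result by the spectrum of a
recovered_b = irfft(rfft(direct, n_full) / rfft(a, n_full), n_full)[:len(b)]

print(np.allclose(direct, via_fft), np.allclose(recovered_b, b))
```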

Here is an example of a simple deconvolution with numpy:

(x is your excitation sweep signal and y is the recorded sweep signal, from which you want to obtain the impulse response.)

import numpy as np
from numpy.fft import rfft, irfft

# define length of FFT (zero padding): at least double length of input
input_length = np.size(x)
n = np.ceil(np.log2(input_length)) + 1
N_fft = int(pow(2, n))

# transform 
# real fft: N real input -> N/2+1 complex output (single sided spectrum)
# real ifft: N/2+1 complex input -> N real output
X_f = rfft(x, N_fft)
Y_f = rfft(y, N_fft)

# deconvolve
H = Y_f / X_f

# backward transform
h = irfft(H, N_fft)

# truncate to original length
h = h[:input_length]

This simple solution is practical but can (and should) be improved. One problem is that the noise floor gets boosted at frequencies where X_f has a low amplitude. For example, if your exponential sine sweep starts at 100 Hz, then for the frequency bins below that frequency you are dividing by (almost) zero. One simple possible solution is to first invert X_f, apply a band-limiting filter (highpass + lowpass) to remove the "boost areas", and then multiply the result with Y_f:

# deconvolve
Xinv_f = 1 / X_f
Xinv_f = Xinv_f * bandlimit_filter
H = Y_f * Xinv_f
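The snippet above leaves `bandlimit_filter` undefined. One possible way to build it, as a sketch only, is a real-valued gain per rfft bin that is 1 inside the sweep band and rolls off to 0 outside it with raised-cosine edges. The helper name `bandlimit_mask`, the `rel_width` parameter, and the example sample rate and band edges are all assumptions made for illustration; a properly designed highpass/lowpass response could be used instead.

```python
import numpy as np

def bandlimit_mask(n_fft, fs, f_low, f_high, rel_width=0.1):
    """Gain per rfft bin: 1 in [f_low, f_high], raised-cosine roll-off outside.

    Hypothetical helper: the transition widths are rel_width times the
    respective edge frequency.
    """
    f = np.fft.rfftfreq(n_fft, d=1.0 / fs)   # bin centre frequencies in Hz
    mask = np.zeros_like(f)
    lo_w = rel_width * f_low
    hi_w = rel_width * f_high

    # passband
    mask[(f >= f_low) & (f <= f_high)] = 1.0

    # lower transition: rises from 0 at f_low - lo_w to 1 at f_low
    lo = (f >= f_low - lo_w) & (f < f_low)
    mask[lo] = 0.5 * (1.0 - np.cos(np.pi * (f[lo] - (f_low - lo_w)) / lo_w))

    # upper transition: falls from 1 at f_high to 0 at f_high + hi_w
    hi = (f > f_high) & (f <= f_high + hi_w)
    mask[hi] = 0.5 * (1.0 + np.cos(np.pi * (f[hi] - f_high) / hi_w))
    return mask

# example values (assumed): 8 kHz sample rate, sweep from 50 Hz to 3.5 kHz
bandlimit_filter = bandlimit_mask(n_fft=16384, fs=8000, f_low=50.0, f_high=3500.0)
```

The mask has `n_fft // 2 + 1` entries, so it can be multiplied element-wise with `Xinv_f`. Because it is zero-phase, it adds a little symmetric ringing around the main peak of the impulse response; smoother edges reduce that.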

Regarding the distortion: a nice property of the exponential sine sweep is that harmonic distortion produced during the measurement (e.g. by nonlinearities in the loudspeaker) shows up after deconvolution as smaller "side" responses that precede the "main" response (see this for more details). These side responses are the distortion products and can simply be removed with a time window. If the "main" response has no delay (it starts at t=0), the side responses wrap around and appear at the end of the whole iFFT output, so you remove them by windowing out the second half.
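Windowing out the second half can be sketched as follows. The helper name `keep_first_half` and the `fade_len` parameter are assumptions for illustration: the function keeps the first half of the full-length iFFT result and applies a short half-cosine fade-out so that the cut does not end abruptly.

```python
import numpy as np

def keep_first_half(h_full, fade_len=64):
    """Discard the second half of h_full and fade out the end of the first half.

    Hypothetical helper; fade_len is the number of samples in the fade-out.
    """
    half = len(h_full) // 2
    h = np.array(h_full[:half], dtype=float)   # copy, leave the input untouched
    fade_len = min(fade_len, half)
    if fade_len > 0:
        # half-cosine ramp from 1 down towards 0
        fade = 0.5 * (1.0 + np.cos(np.pi * np.arange(fade_len) / fade_len))
        h[-fade_len:] *= fade
    return h

# toy input just to show the shape of the result
demo = keep_first_half(np.ones(1000), fade_len=100)
print(len(demo), demo[0], demo[-1])
```

In the numpy example earlier in this answer, `h[:input_length]` already drops the second half, since `N_fft` is at least twice the input length; the fade-out is an optional refinement.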

I cannot guarantee that this is 100% correct from a signal-theory point of view, but I think it shows the point and it works ;)
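To try the whole chain without any hardware, here is a minimal end-to-end sketch on synthetic data. Everything specific in it is an assumption chosen for illustration: the sample rate, the sweep duration and band, and the made-up "room" response `h_true`. The "recording" is simulated by convolving the sweep with `h_true`, with no noise and no distortion, so the plain spectral division recovers the response almost exactly; a real measurement will not behave this nicely.

```python
import numpy as np
from numpy.fft import rfft, irfft

fs = 8000               # sample rate in Hz (assumed)
T = 1.0                 # sweep duration in seconds (assumed)
f1, f2 = 50.0, 3500.0   # sweep start and stop frequencies in Hz (assumed)

# exponential sine sweep: instantaneous frequency f1 * exp(t * R / T)
t = np.arange(int(fs * T)) / fs
R = np.log(f2 / f1)
x = np.sin(2 * np.pi * f1 * T / R * (np.exp(t * R / T) - 1.0))

# made-up "room": direct sound plus two echoes
h_true = np.zeros(200)
h_true[0], h_true[60], h_true[150] = 1.0, 0.5, -0.25

# simulated noise-free recording
y = np.convolve(x, h_true)

# zero-padded FFT length: next power of two above the length, then doubled
n_fft = 2 ** (int(np.ceil(np.log2(len(y)))) + 1)

# deconvolve by spectral division and go back to the time domain
H = rfft(y, n_fft) / rfft(x, n_fft)
h_est = irfft(H, n_fft)[:len(h_true)]

max_err = np.max(np.abs(h_est - h_true))
print("max abs error:", max_err)
```

Plotting `h_est` against `h_true` (for example with matplotlib) should show the three peaks at samples 0, 60 and 150.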
