How to do Histogram Equalization based on audio frequency?
I have already tried histogram equalization on images and it works just fine.
Now I want to apply the same approach to audio frequency instead of image gray levels, which means I would like to make the spectrum flatter. The sampling rate I use is 44.1 kHz, and I want the energy spread evenly over the 0-22050 Hz range, but after my attempt the peak is still the highest.
And this is what I have tried:
I think the original histogram I plotted is already wrong: I can't count the number of occurrences per frequency, or maybe I shouldn't be doing this at all. Somebody told me I need to use fft(), but I have no idea how to do it.
Any help would be appreciated! Thanks
Here is the code for how I plot the spectrum:
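import librosa
import numpy as np
import matplotlib.pyplot as plt

file = 'example.wav'
# sr=None keeps the file's native sampling rate
y, sr = librosa.load(file, sr=None)
n_fft = 2048
# Magnitude spectrogram: rows are frequency bins, columns are frames
S = np.abs(librosa.stft(y, n_fft=n_fft, hop_length=n_fft // 2))
# Average magnitude of each frequency bin over time
D_AVG = np.mean(S, axis=1)
plt.figure(figsize=(25, 12))
plt.bar(np.arange(D_AVG.shape[0]), D_AVG)
# Bin n corresponds to the frequency n * sr / n_fft
x_ticks_positions = list(range(0, n_fft // 2 + 1, n_fft // 16))
x_ticks_labels = [f'{sr / n_fft * n:.0f}Hz' for n in x_ticks_positions]
plt.xticks(x_ticks_positions, x_ticks_labels)
plt.xlabel('Frequency')
# The plotted values are linear magnitudes, not decibels
plt.ylabel('Mean magnitude (linear)')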
plt.savefig('spectrum.png')
"Equalization" in the sense of making a flat frequency spectrum is usually done by a whitening transformation. This post on dsp.stackexchange might also be helpful. As Mark mentioned, this spectral equalization is different from histogram equalization in image processing: histogram equalization remaps values so that their distribution becomes uniform, whereas whitening rescales the power at each frequency.
Equalizing/whitening the spectrum of a signal:
Estimate the PSD. Given an array of samples x with sample rate fs, you can compute a robust estimate of the power spectral density (PSD) with scipy.signal.welch:
f, psd = scipy.signal.welch(x, fs=fs)
This function performs Welch's method. Basically, it divides the signal into several segments, takes an FFT of each one, and averages the power spectra to get a good estimate of how much power x has at each frequency on average. The point of all this is that it gives a more reliable frequency characterization than taking a single FFT of x as a whole.
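To make the segment-and-average idea concrete, here is a minimal sketch of Welch's method written out by hand. The helper name welch_by_hand and the test signal are made up for illustration, and the settings (256-sample Hann-windowed segments, 50% overlap, mean removal per segment, density scaling) are my reading of the scipy.signal.welch defaults, so treat this as an approximation of what the library does, not its actual implementation.

```python
import numpy as np
import scipy.signal


def welch_by_hand(x, fs, nperseg=256):
    """Hypothetical helper: average the power spectra of overlapping segments."""
    step = nperseg // 2                      # 50% overlap between segments
    win = scipy.signal.get_window("hann", nperseg)
    spectra = []
    for start in range(0, len(x) - nperseg + 1, step):
        seg = x[start:start + nperseg]
        seg = seg - seg.mean()               # remove the segment's mean
        spectra.append(np.abs(np.fft.rfft(seg * win)) ** 2)
    # Average over segments, then scale to a density (power per Hz)
    psd = np.mean(spectra, axis=0) / (fs * np.sum(win ** 2))
    psd[1:-1] *= 2                           # one-sided: fold in negative frequencies
    f = np.fft.rfftfreq(nperseg, d=1 / fs)
    return f, psd


# Illustrative signal: a 1 kHz tone buried in noise
fs = 8000
rng = np.random.default_rng(0)
t = np.arange(4 * fs) / fs
x = np.sin(2 * np.pi * 1000 * t) + 0.5 * rng.standard_normal(t.size)

f_hand, psd_hand = welch_by_hand(x, fs)
f_ref, psd_ref = scipy.signal.welch(x, fs=fs)
peak_hz = f_hand[np.argmax(psd_hand)]        # frequency of the strongest bin
```

Comparing psd_hand with psd_ref is a quick way to check whether the assumed defaults match the SciPy version you have installed.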
Compute equalizer gain. Use eq_gain = 1 / (1e-6 + psd)**0.5, or something similar, to determine the gain of the equalizer. The 1e-6 offset in the denominator is there to avoid division by zero. It often happens that the PSD is extremely small at some frequencies because, say, x went through an anti-aliasing filter that made some high-frequency powers nearly zero.
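The gain formula can be tried on a toy PSD to see what it does. This is only a sketch: equalizer_gain and rel_floor are names I made up, and scaling the offset relative to the largest PSD value is my own variation on the fixed 1e-6. The reason for it is that scipy.signal.welch returns a density (power per Hz) by default, which for audio normalized to [-1, 1] can be comparable to or smaller than 1e-6, in which case a fixed offset would dominate the gain.

```python
import numpy as np


def equalizer_gain(psd, rel_floor=1e-6):
    """Amplitude gain per frequency bin that flattens a PSD.

    rel_floor is a hypothetical parameter: the offset is a fraction of
    the largest PSD value instead of the fixed 1e-6.
    """
    psd = np.asarray(psd, dtype=float)
    floor = rel_floor * psd.max()
    return 1.0 / np.sqrt(floor + psd)


# Toy PSD: strong low band, weaker high bands, one completely empty bin
psd = np.array([4.0, 1.0, 0.25, 0.0])
gain = equalizer_gain(psd)

# Power scales with the square of an amplitude gain
equalized = psd * gain ** 2
```

Bins with real power end up close to 1 after equalization, while the empty bin gets a large but finite gain instead of a division by zero.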
Apply the equalizer gain. Finally, eq_gain needs to be applied to the signal x to equalize it. There are many ways this could be done, but one way is to use scipy.signal.firwin2 to turn the gains into an FIR filter,
eq_filter = scipy.signal.firwin2(99, f, eq_gain, fs=fs)
and use a convolution or scipy.signal.lfilter to apply the filter to x. You can then use scipy.signal.welch again to check that the PSD is flatter than before.
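Putting the three steps together, an end-to-end sketch might look like the following. The function names whiten and spectral_flatness, the rel_floor parameter, and the synthetic test signal are all assumptions of mine and not part of the answer. Two further choices are mine as well: the offset is scaled relative to the PSD peak, and the DC gain is copied from its neighbor because welch removes each segment's mean by default, which leaves the DC bin artificially small.

```python
import numpy as np
import scipy.signal


def whiten(x, fs, numtaps=99, rel_floor=1e-6):
    """Flatten the spectrum of x with an FIR equalizer built from its Welch PSD."""
    f, psd = scipy.signal.welch(x, fs=fs)
    # Amplitude gain; offset relative to the PSD peak (assumption, see lead-in)
    eq_gain = 1.0 / np.sqrt(rel_floor * psd.max() + psd)
    # Assumption: reuse the neighboring gain at DC to avoid over-boosting it
    eq_gain[0] = eq_gain[1]
    eq_filter = scipy.signal.firwin2(numtaps, f, eq_gain, fs=fs)
    return scipy.signal.lfilter(eq_filter, 1.0, x)


def spectral_flatness(x, fs):
    """Geometric mean over arithmetic mean of the PSD (1.0 means perfectly flat)."""
    _, psd = scipy.signal.welch(x, fs=fs)
    psd = psd[1:-1]  # leave out the DC and Nyquist bins
    return np.exp(np.mean(np.log(psd))) / np.mean(psd)


# Synthetic test signal: white noise colored by a one-pole low-pass filter
fs = 44100
rng = np.random.default_rng(0)
white = rng.standard_normal(2 * fs)
x = scipy.signal.lfilter([1.0], [1.0, -0.9], white)

y = whiten(x, fs)
flat_before = spectral_flatness(x, fs)
flat_after = spectral_flatness(y, fs)
print(f"flatness before: {flat_before:.2f}, after: {flat_after:.2f}")
```

Note that the output level is arbitrary after this kind of equalization, so y should be normalized before it is written to an audio file, and the linear-phase FIR filter delays the signal by (numtaps - 1) / 2 samples.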