How to do histogram equalization based on audio?

Question

I have already tried histogram equalization on images and it works just fine.

But now I want to apply the same approach to audio frequency content instead of image gray levels. In other words, I would like to make the spectrum flatter. The sampling rate I use is 44.1 kHz, and I want the frequency content spread evenly over the 0-22050 Hz range, but the peak is still the highest.

Here is the spectrum:

And this is what I have tried:

I think the original histogram I plot is already wrong: I can't count the number of occurrences per frequency, or maybe I shouldn't do this at all. Somebody told me I need to use fft(), but I have no idea how to do it.

Any help would be appreciated! Thanks

Here is the code for how I plot the spectrum:

import librosa
import numpy as np
import matplotlib.pyplot as plt

file = 'example.wav'
# sr=None keeps the file's native sampling rate
y, sr = librosa.load(file, sr=None)

n_fft = 2048
S = librosa.stft(y, n_fft=n_fft, hop_length=n_fft // 2)

# Magnitude of each STFT bin, averaged over all frames
S = np.abs(S)
D_AVG = np.mean(S, axis=1)

plt.figure(figsize=(25, 12))

plt.bar(np.arange(D_AVG.shape[0]), D_AVG)
# Label every (n_fft // 16)-th bin with its frequency in Hz
x_ticks_positions = [n for n in range(0, n_fft // 2, n_fft // 16)]
x_ticks_labels = [f'{sr / n_fft * n:.0f} Hz' for n in x_ticks_positions]
plt.xticks(x_ticks_positions, x_ticks_labels)
plt.xlabel('Frequency')
# The values are linear magnitudes, not decibels
plt.ylabel('Mean magnitude')
plt.savefig('spectrum.png')

Answer 1

"Equalization" in the sense of making a flat frequency spectrum is usually done by a whitening transformation. This post on dsp.stackexchange might also be helpful. As Mark mentioned, this spectral equalization is different from histogram equalization in image processing.

Equalizing/whitening the spectrum of a signal:

Estimate the PSD. Given an array of samples x with sample rate fs, you can compute a robust estimate of the power spectral density (PSD) with scipy.signal.welch:
```
 f, psd = scipy.signal.welch(x, fs=fs)
```
This function performs Welch's method. Basically, it divides the signal into several segments, takes the FFT of each one, and averages the power spectra to get a good estimate of how much power x has at each frequency on average. The point is that this gives a more reliable frequency characterization than taking a single FFT of x as a whole.
Compute equalizer gain. Use eq_gain = 1 / (1e-6 + psd)**0.5, or something similar, to determine the gain of the equalizer. The 1e-6 offset in the denominator avoids division by zero. It often happens that the PSD is extremely small at some frequencies because, say, x went through an anti-aliasing filter that made some high-frequency powers nearly zero.
Apply the equalizer gain. Finally, eq_gain needs to be applied to the signal x to equalize it. There are many ways this could be done, but one way is to use scipy.signal.firwin2 to turn the gains into an FIR filter,
```
 eq_filter = scipy.signal.firwin2(99, f, eq_gain, fs=fs)
```
and use a convolution or scipy.signal.lfilter to apply the filter to x. You can then use scipy.signal.welch again to check that the PSD is flatter than before.
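
The three steps above can be put together roughly as follows. This is only a sketch under some assumptions: the helper names whiten and psd_spread_db are made up for illustration, the test signal is synthetic colored noise rather than a real recording, and the frequency grid is passed to firwin2 in normalized form (f / f[-1], with firwin2's default fs of 2) instead of fs=fs, purely as a guard against floating-point rounding in the last grid point.

```python
import numpy as np
import scipy.signal


def whiten(x, fs, numtaps=99, eps=1e-6):
    """Flatten the spectrum of x with an FIR equalizer built from its PSD."""
    # Step 1: Welch estimate of the PSD (default segment length).
    f, psd = scipy.signal.welch(x, fs=fs)
    # Step 2: gain is the inverse square root of the PSD; eps avoids
    # division by zero where the PSD is nearly zero.
    eq_gain = 1 / (eps + psd) ** 0.5
    # Step 3: turn the gains into a linear-phase FIR filter. The grid is
    # normalized so it runs exactly from 0 to 1 (firwin2's default fs is 2).
    eq_filter = scipy.signal.firwin2(numtaps, f / f[-1], eq_gain)
    return scipy.signal.lfilter(eq_filter, 1.0, x)


def psd_spread_db(x, fs):
    """Largest-to-smallest PSD ratio in dB; smaller means flatter."""
    _, psd = scipy.signal.welch(x, fs=fs)
    # Skip the DC and Nyquist bins, which Welch's defaults (detrending,
    # one-sided scaling) treat differently from the other bins.
    psd = psd[1:-1]
    return 10 * np.log10(psd.max() / psd.min())


# Hypothetical test signal: white noise colored by a one-pole low-pass
# filter, so low frequencies carry much more power than high ones.
rng = np.random.default_rng(0)
fs = 44100
noise = rng.standard_normal(5 * fs)
x = scipy.signal.lfilter([1.0], [1.0, -0.9], noise)

y = whiten(x, fs)

spread_before = psd_spread_db(x, fs)
spread_after = psd_spread_db(y, fs)
print(f"PSD spread before: {spread_before:.1f} dB, after: {spread_after:.1f} dB")
```

With a real recording you would replace the synthetic x with the samples returned by librosa.load. Note that lfilter with a 99-tap linear-phase filter delays the output by 49 samples, and that the overall level of y changes because the gains are not normalized; rescale y before writing it to a file if that matters.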

Solution 1
1 2020-12-14 05:26:20
