Why does librosa plot differ from matplotlib and Audacity

I am reading PCM data from a file and then plotting it. I've noticed that the plot varies between librosa.display.waveplot, matplotlib's plot, and Audacity.

Here are the code and the images:

%matplotlib inline
import matplotlib.pyplot as plt
import librosa.display
import numpy as np
import IPython.display as ipd
import pylab

# the PCM file holds 32-bit little-endian integers at a sampling rate of 16 kHz
pcm_data = np.fromfile('someaudio.pcm', dtype=np.int32)

# the sample has the same sound as audacity
ipd.Audio(data=pcm_data, rate=16000) 

# all of these give the same resulting plot
plt.figure()
plt.subplot(3, 1, 1)
#librosa.display.waveplot(pcm_data, sr=16000)
#librosa.display.waveplot(pcm_data.astype('double'), sr=16000)
librosa.display.waveplot(pcm_data.astype('float'), max_points=None, sr=16000, max_sr=16000)

This result looks like: [image]

# alternatively plot via matplotlib
pylab.plot(pcm_data)
pylab.show()

This result looks like: [image]

The result from matplotlib looks like the one from Audacity: [image]

matplotlib and Audacity show the actual signal samples, which apparently are all negative in the second half of the recording.

librosa, on the other hand, shows an envelope of the absolute signal, as explained in its documentation:

Plot the amplitude envelope of a waveform.

If y is monophonic, a filled curve is drawn between [-abs(y), abs(y)].

y is the signal in this case.

This effectively mirrors the envelope along the x-axis, which is why the librosa plot is symmetrical. matplotlib and Audacity apparently do no such thing.
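The mirroring can be reproduced with a few lines of NumPy. This is only a minimal sketch of the documented [-abs(y), abs(y)] bounds, not librosa's actual code; the array `y` and the names `upper` and `lower` are made up for illustration.

```python
import numpy as np

# Hypothetical signal whose samples are all negative, like the second
# half of the recording in the question.
y = np.array([-0.2, -0.5, -0.9, -0.4, -0.1])

# Bounds of the filled curve described in the docs: [-abs(y), abs(y)].
upper = np.abs(y)
lower = -np.abs(y)

# The raw samples never rise above zero, but the filled region does.
print(y.max())      # largest raw sample
print(upper.max())  # top of the mirrored envelope
```

Because `upper` is always the negation of `lower`, the filled region is symmetric about zero no matter what sign the samples have.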

One might argue that librosa's behavior effectively hides asymmetric waveforms (i.e., waveforms whose positive and negative samples differ in amplitude), which do occur in the wild. From soundonsound.com:

This asymmetry is due mainly to two things, the first being the relative phase relationships between the fundamental and different harmonic components in a harmonically complex signal. In combining different frequency signals with differing phase relationships, the result is often a distinctly asymmetrical waveform, and that waveform asymmetry often changes and evolves over time, too. That's just what happens when complex related signals are superimposed.
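The phase effect described in the quote can be checked numerically. The sketch below is an illustration under assumed values (a unit-amplitude fundamental plus a half-amplitude second harmonic); the function name `peak_levels` is hypothetical and does not come from the quoted article.

```python
import numpy as np

def peak_levels(phase, n=1200):
    """Return (max, min) of a fundamental plus a half-amplitude second harmonic."""
    x = 2 * np.pi * np.arange(n) / n              # one full period
    y = np.sin(x) + 0.5 * np.sin(2 * x + phase)   # harmonic shifted by `phase`
    return y.max(), y.min()

in_phase = peak_levels(0.0)        # positive and negative peaks match in size
shifted = peak_levels(np.pi / 2)   # positive and negative peaks differ
print(in_phase, shifted)
```

With a phase shift of a quarter cycle on the harmonic, the negative peak is about twice as deep as the positive peak is high, even though both components are symmetric on their own.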

One may also argue that there isn't much useful information in the asymmetry, as humans usually cannot perceive it.

If you believe librosa's behavior is unexpected or wrong, I recommend filing a bug report and asking for an explanation.

I received some answers via the librosa forums. Here is one answer from Brian McFee:

Following onto what Vincent posted, librosa's wave plot does not show samples directly, for two reasons:

  • it would blow up the memory usage by keeping points at a higher resolution than is necessary for visualization

  • it can be obscured by high frequency noise.

Instead, librosa's plotter works more like a typical DAW, where the audio signal is down-sampled for viz purposes, and the envelope is visualized rather than the signal itself. These steps are accomplished by showing the max(abs(y[i:i+k])) rather than the samples y[i], y[i+1], ... y[i+k]. The length of the downsampling window is controlled by the parameters to waveplot.
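The max(abs(y[i:i+k])) reduction can be sketched as follows. This is an illustration only: it assumes non-overlapping windows and zero-pads the final window, and the function name `abs_max_envelope` is hypothetical. librosa's own implementation may differ in both respects.

```python
import numpy as np

def abs_max_envelope(y, k):
    """Reduce y to one value per window of k samples: max(abs(y[i:i+k]))."""
    y = np.asarray(y, dtype=float)
    n_windows = -(-len(y) // k)            # ceiling division
    padded = np.zeros(n_windows * k)
    padded[:len(y)] = np.abs(y)            # zero-pad the last, partial window
    return padded.reshape(n_windows, k).max(axis=1)

# Seven made-up samples reduced with a window of three samples.
y = np.array([0.1, -0.7, 0.3, -0.2, 0.05, -0.4, 0.6])
env = abs_max_envelope(y, k=3)
print(env)
```

The result has one non-negative value per window, so the sign of the original samples is lost at this step; plotting `env` and `-env` then gives the symmetric shape.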

Since the above- and below-axis information is discarded by taking the abs, we use the axis to separate left and right channels (left above, right below) in stereo signals. In mono signals, the envelope is reflected across the y axis, which produces the symmetric figure you reported.

Different DAWs will do these steps slightly differently, and a fancy implementation would revert down to sample plotting once you've zoomed in to a range such that doing so becomes feasible. Matplotlib doesn't make this entirely easy to pull off, so we opted for this compromise here. If you want sample-accurate plotting, we suggest using pyplot.plot() instead.
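For sample-accurate plotting with a time axis in seconds, something like the following could work. It is a sketch, not part of the quoted answer: the helper names `sample_times` and `plot_samples` are hypothetical, and the 16 kHz rate in the usage note is taken from the question.

```python
import numpy as np

def sample_times(n_samples, sr):
    """Time in seconds of each sample index at sampling rate sr."""
    return np.arange(n_samples) / float(sr)

def plot_samples(y, sr):
    """Draw every sample against time; returns the Matplotlib Axes."""
    import matplotlib.pyplot as plt   # imported lazily; only needed for drawing
    fig, ax = plt.subplots()
    ax.plot(sample_times(len(y), sr), y, linewidth=0.5)
    ax.set_xlabel("Time (s)")
    ax.set_ylabel("Sample value")
    return ax
```

With the data from the question, the call would be `plot_samples(pcm_data, sr=16000)`, which keeps the sign of every sample and so should match the Audacity view.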
