NAudio-generated waveform doesn't match the sound for some .wav and .m4a files

I'm developing a tool for subtitling audio, and I use NAudio to generate a waveform so the user can identify the sound. Each audio file is around 1 hour long, and for some files the waveform stops matching the sound from the middle of the audio onward.

Here is the code:

using System.Drawing;
using System.Drawing.Imaging;
using NAudio.WaveFormRenderer;

public static class WaveFormRendererTool
{
    public static void draw(int width, string filename)
    {
        string imagepath = filename + ".png";

        // Four peak providers are created; only averagePeakProvider is passed to Render below.
        var maxPeakProvider = new MaxPeakProvider();
        var rmsPeakProvider = new RmsPeakProvider(200); // e.g. 200
        var samplingPeakProvider = new SamplingPeakProvider(200); // e.g. 200
        var averagePeakProvider = new AveragePeakProvider(4); // e.g. 4

        SolidBrush brush = new SolidBrush(Color.Green);

        var myRendererSettings = new StandardWaveFormRendererSettings();
        //var myRendererSettings = new SoundCloudBlockWaveFormSettings(Color.Red,Color.Green,Color.Yellow,Color.Blue);
        myRendererSettings.Width = width; // image width in pixels, one peak per pixel
        myRendererSettings.TopHeight = 75;
        myRendererSettings.BottomHeight = 75;
        myRendererSettings.BackgroundColor = Color.White;
        myRendererSettings.PixelsPerPeak = 1;
        myRendererSettings.TopPeakPen = new Pen(brush);
        myRendererSettings.BottomPeakPen = new Pen(brush);
        myRendererSettings.TopSpacerPen = new Pen(brush);

        var renderer = new WaveFormRenderer();
        var audioFilePath = filename;
        var image = renderer.Render(audioFilePath, averagePeakProvider, myRendererSettings);
        /* if (File.Exists(imagepath)) {
            File.Delete(imagepath);
        }*/
        image.Save(imagepath, ImageFormat.Png);
        renderer = null;
    }
}

The waveform is flat even though someone is already speaking.

Here is the code that calculates the width:


    MediaFoundationReader wf = new MediaFoundationReader(file.FullName);
    audioLength = wf.TotalTime.TotalSeconds;
    int width = Convert.ToInt32(wf.TotalTime.TotalSeconds*10);

    canvas.Width = width * canstf.ScaleX;
    canvaswidth = canvas.Width;
    canvas.Height = 150;
    img.Width = width;
    //img.Height = 100;

    img.Source = null;
    WaveFormRendererTool.draw(width, file.FullName);
    img.Source = ImageRotation.LoadImageFile(file.FullName + ".png");
    scroller1.ScrollToHorizontalOffset(0);
    initialCanvas(width);

Without access to your specific source file we can't reproduce your problem, especially since I have no idea what value width has, so this is going to be a bit of guesswork. Depending on the audio file, and assuming you're using Mark's NAudio.WaveFormRenderer code, the rendered waveform could be entirely accurate.

To be honest though, from the code and the stretched-out image, I'm not certain that you've got the image synced up correctly with the timestamp. If you're getting similar results with the 4 different PeakProvider variants you're initializing, then it's almost certain that something is wrong in your scaling.

Unfortunately you haven't provided the actual display code, so I can't point out where the error might be. You need to go back and cross-check the code that maps time onto the width of the rendered waveform image.


I played with the file over the weekend, and I'm reasonably certain that the main problem is a mismatch between your calculations and the ones done by the WaveFormRenderer class, which leads to drift over long time periods.

Here's the code used by WaveFormRenderer to determine how many samples it will use per bar:

int bytesPerSample = (reader.WaveFormat.BitsPerSample / 8);
var samples = reader.Length / (bytesPerSample);
var samplesPerPixel = (int)(samples / settings.Width);
var stepSize = settings.PixelsPerPeak + settings.SpacerPixels;
peakProvider.Init(reader, samplesPerPixel * stepSize);

Invoking this with your code, samplesPerPixel truncates to 1599, a fraction under the 1600 samples that make up 1/10th of a second. So instead of getting 0.1 seconds per pixel in the rendered image, you're getting 0.0999375 seconds per pixel. At 00:24:14 in the file (where your screenshot appears to be) the accumulated drift is ~0.91 seconds.
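The drift figure can be reproduced with a few lines of arithmetic. This is only a sketch: the class and method names are made up for illustration, and the 16000 Hz sample rate is an assumption inferred from the numbers above (1600 samples per 1/10th of a second), not something read from the actual file.

```csharp
using System;

// Hypothetical helper, not part of NAudio or NAudio.WaveFormRenderer.
public static class WaveformDrift
{
    // Difference between the playback position and the audio time that the
    // rendered image actually shows at the pixel where the UI draws that position.
    public static double DriftSeconds(double positionSeconds, int pixelsPerSecond,
                                      int samplesPerPixel, int sampleRate)
    {
        // Pixel the UI uses for this position (10 pixels per second in the question).
        double pixel = positionSeconds * pixelsPerSecond;

        // Audio time the renderer has consumed by that pixel.
        double renderedSeconds = pixel * samplesPerPixel / sampleRate;

        return positionSeconds - renderedSeconds;
    }
}
```

Calling WaveformDrift.DriftSeconds(24 * 60 + 14, 10, 1599, 16000) gives roughly 0.91 seconds, while passing 1600 for samplesPerPixel gives 0.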

Fortunately the fix is simple, if a little counter-intuitive given all the talk of rounding errors: use truncation rather than rounding when calculating width:

int width = (int)(wf.TotalTime.TotalSeconds*10);

This should guarantee that the samplesPerPixel calculation always comes out to 1600 rather than 1599, allowing you to sync your playback without the drift. This is certainly simpler than trying to re-tool everything in your code to adjust for a very slight per-pixel drift.
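To see why the rounding mode matters, here is a small comparison of the two width calculations. It is a sketch under stated assumptions: the helper names are hypothetical, and the duration (3600.06 seconds) and sample count (16 kHz mono) are invented values, not measurements from your file. Convert.ToInt32 rounds to the nearest integer, whereas the (int) cast truncates.

```csharp
using System;

// Hypothetical helper, not part of NAudio or NAudio.WaveFormRenderer.
public static class WidthRounding
{
    // Width as computed in the question: rounds to the nearest integer.
    public static int RoundedWidth(double totalSeconds) =>
        Convert.ToInt32(totalSeconds * 10);

    // Width as suggested above: the cast drops the fractional part.
    public static int TruncatedWidth(double totalSeconds) =>
        (int)(totalSeconds * 10);

    // Same integer division the renderer performs.
    public static int SamplesPerPixel(long samples, int width) =>
        (int)(samples / width);
}
```

For the assumed 3600.06-second file with 57,600,960 samples, RoundedWidth returns 36001 and SamplesPerPixel comes out as 1599, while TruncatedWidth returns 36000 and SamplesPerPixel comes out as 1600.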
