
NAudio-generated waveform doesn't match the sound for some .wav and .m4a files

I am developing a tool for subtitling audio and use NAudio to generate a waveform so that the user can locate the sound visually. Each audio file is around 1 hour long, and for some files the waveform stops matching the sound from about the middle of the file onward.

Here is the code:

public static class WaveFormRendererTool
{
    public static void draw(int width, string filename)
    {
        string imagepath = filename + ".png";
        var maxPeakProvider = new MaxPeakProvider();
        var rmsPeakProvider = new RmsPeakProvider(200); // e.g. 200
        var samplingPeakProvider = new SamplingPeakProvider(200); // e.g. 200
        var averagePeakProvider = new AveragePeakProvider(4); // e.g. 4

        SolidBrush brush = new SolidBrush(Color.Green);

        var myRendererSettings = new StandardWaveFormRendererSettings();
        //var myRendererSettings = new SoundCloudBlockWaveFormSettings(Color.Red,Color.Green,Color.Yellow,Color.Blue);
        myRendererSettings.Width = width;
        myRendererSettings.TopHeight = 75;
        myRendererSettings.BottomHeight = 75;
        myRendererSettings.BackgroundColor = Color.White;
        myRendererSettings.PixelsPerPeak = 1;
        myRendererSettings.TopPeakPen = new Pen(brush);
        myRendererSettings.BottomPeakPen = new Pen(brush);
        myRendererSettings.TopSpacerPen = new Pen(brush);

        var renderer = new WaveFormRenderer();
        var audioFilePath = filename;
        var image = renderer.Render(audioFilePath, averagePeakProvider, myRendererSettings);
        /* if (File.Exists(imagepath)) {
            File.Delete(imagepath);
        }*/
        image.Save(imagepath, ImageFormat.Png);
        renderer = null;
    }
}

In the screenshot, the waveform is still flat at a point where speech has already started.

Here is the code that calculates the width:


MediaFoundationReader wf = new MediaFoundationReader(file.FullName);
audioLength = wf.TotalTime.TotalSeconds;
int width = Convert.ToInt32(wf.TotalTime.TotalSeconds*10);

canvas.Width = width * canstf.ScaleX;
canvaswidth = canvas.Width;
canvas.Height = 150;
img.Width = width;
//img.Height = 100;

img.Source = null;
WaveFormRendererTool.draw(width, file.FullName);
img.Source = ImageRotation.LoadImageFile(file.FullName + ".png");
scroller1.ScrollToHorizontalOffset(0);
initialCanvas(width);

Without access to your specific source file I can't reproduce your problem, and since I have no idea what value width has, this is going to be a bit of guesswork. Depending on the audio file, and assuming you're using Mark's NAudio.WaveFormRenderer code, the rendered waveform could be entirely accurate.

To be honest though, from the code and the stretched out image, I'm not certain that you've got the image synched up correctly with the timestamp. If you're getting similar results with the 4 different PeakProvider variants you're initializing then it's almost certain that you've got something wrong in your scaling.

Unfortunately you haven't provided the actual display code, so I can't point out where the error might be. You need to go back and cross-check the code that maps time onto the width of the rendered waveform image.


I played with the file over the weekend and I'm reasonably certain that the main problem is a disconnect between your calculations and the ones done by the WaveFormRenderer class, and that's leading to drift over large time periods.

Here's the code used by WaveFormRenderer to determine how many samples it will use per bar:

int bytesPerSample = (reader.WaveFormat.BitsPerSample / 8);
var samples = reader.Length / (bytesPerSample);
var samplesPerPixel = (int)(samples / settings.Width);
var stepSize = settings.PixelsPerPeak + settings.SpacerPixels;
peakProvider.Init(reader, samplesPerPixel * stepSize);

Invoking this with your code, samplesPerPixel truncates to 1599, a fraction under the 1600 samples that make up 1/10th of a second. So instead of getting 0.1 seconds per pixel in the rendered image you're getting 0.0999375 seconds per pixel. At 00:24:14 in the file (where your screenshot appears to be) the accumulated drift is ~0.91 seconds.
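To make the arithmetic concrete, here is a minimal sketch that reproduces the drift figure. The 16000 samples-per-second rate is an assumption inferred from the numbers above (1599 samples corresponding to 0.0999375 seconds), and the DriftEstimate class and DriftSeconds helper are hypothetical names used only for illustration.

```csharp
using System;
using System.Globalization;

public static class DriftEstimate
{
    // Returns how far (in seconds) a ruler that assumes `assumedSecondsPerPixel`
    // has drifted from the rendered image at `positionSeconds` into the file.
    public static double DriftSeconds(int samplesPerPixel, int sampleRate,
                                      double assumedSecondsPerPixel, double positionSeconds)
    {
        // Time actually covered by one pixel column of the rendered image.
        double actualSecondsPerPixel = (double)samplesPerPixel / sampleRate;

        // Number of pixels the display code believes it has advanced.
        double pixels = positionSeconds / assumedSecondsPerPixel;

        // Each of those pixels is short by the same small amount.
        return pixels * (assumedSecondsPerPixel - actualSecondsPerPixel);
    }

    public static void Main()
    {
        const int sampleRate = 16000; // assumption: inferred, not read from the file
        double position = new TimeSpan(0, 24, 14).TotalSeconds; // 1454 seconds

        double drift = DriftSeconds(1599, sampleRate, 0.1, position);

        // Prints 0.91
        Console.WriteLine(drift.ToString("F2", CultureInfo.InvariantCulture));
    }
}
```

The error per pixel is only 0.0000625 seconds, but it is multiplied by 14540 pixels by the time playback reaches 00:24:14, which is why the mismatch only becomes visible well into a long file.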

Fortunately the fix is simple, if a little counter-intuitive with all the talk of rounding errors and such: use truncation rather than rounding when calculating width:

int width = (int)(wf.TotalTime.TotalSeconds*10);

This should guarantee that the samplesPerPixel calculation will always come out to 1600 rather than 1599, allowing you to sync your playback without the drift. This is certainly simpler than trying to re-tool everything in your code to adjust to a very slight per-pixel drift.
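The effect of the two width calculations can be compared with a short sketch. The sample count below is a hypothetical file of 3600.06 seconds, and the 16000 samples-per-second rate is again an assumption inferred from the figures above; the SamplesPerPixel helper just mirrors the integer division in the WaveFormRenderer snippet.

```csharp
using System;

public static class WidthRounding
{
    // Same integer division WaveFormRenderer performs on the sample count.
    public static int SamplesPerPixel(long totalSamples, int width)
    {
        return (int)(totalSamples / width);
    }

    public static void Main()
    {
        const int sampleRate = 16000;        // assumption: inferred from the answer's numbers
        const long totalSamples = 57600960L; // hypothetical file, 3600.06 seconds long
        double totalSeconds = (double)totalSamples / sampleRate;

        int rounded = Convert.ToInt32(totalSeconds * 10); // rounds 36000.6 up to 36001
        int truncated = (int)(totalSeconds * 10);         // truncates 36000.6 to 36000

        // Prints: rounded:   width=36001, samplesPerPixel=1599
        Console.WriteLine("rounded:   width={0}, samplesPerPixel={1}",
            rounded, SamplesPerPixel(totalSamples, rounded));

        // Prints: truncated: width=36000, samplesPerPixel=1600
        Console.WriteLine("truncated: width={0}, samplesPerPixel={1}",
            truncated, SamplesPerPixel(totalSamples, truncated));
    }
}
```

Rounding up makes the image one pixel wider than the audio can fill at 1600 samples per pixel, so the integer division drops to 1599. Note that for very short clips the truncated width can push the quotient above 1600 instead, so it may be worth checking the value for the files you actually use.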
