
Why is my C# program faster in a profiler?

I have a relatively large system (~25000 lines so far) for monitoring radio-related devices. It shows graphs and such using the latest version of ZedGraph. The program is written in C# on VS2010 with Win7. The problem is:

  • when I run the program from within VS, it runs slow
  • when I run the program from the built EXE, it runs slow
  • when I run the program through Performance Wizard / CPU Profiler, it runs Blazing Fast.
  • when I run the program from the built EXE, and then start VS and Attach a profiler to ANY OTHER PROCESS, my program speeds up!

I want the program to always run that fast!

Every project in the solution is set to RELEASE, Debug unmanaged code is DISABLED, and Define DEBUG and TRACE constants is DISABLED. I tried Optimize Code, Warning Level and Suppress JIT both ways. In short, I tried all the solutions already proposed on StackOverflow and none worked. The program is slow outside the profiler and fast in the profiler. I don't think the problem is in my code, because it also becomes fast if I attach the profiler to another, unrelated process!

Please help! I really need it to be that fast everywhere, because it's a business critical application and performance issues are not tolerated...

UPDATES 1 - 8 follow

--------------------Update1:--------------------

The problem seems to Not be ZedGraph related, because it still manifests after I replaced ZedGraph with my own basic drawing.

--------------------Update2:--------------------

Running the program in a Virtual machine, the program still runs slow, and running profiler from the Host machine doesn't make it fast.

--------------------Update3:--------------------

Starting screen capture to video also speeds the program up!

--------------------Update4:--------------------

If I open the Intel graphics driver settings window (this thing: http://www.intel.com/support/graphics/sb/img/resolution_new.jpg ) and just constantly hover with the cursor over buttons, so they glow, etc., my program speeds up! It doesn't speed up if I run GPUz or Kombustor though, so the GPU is not downclocking: it stays at a steady 850 MHz.

--------------------Update5:--------------------

Tests on different machines:

-On my Core i5-2400S with Intel HD2000, UI runs slow and CPU usage is ~15%.

-On a colleague's Core 2 Duo with Intel G41 Express, UI runs fast, but CPU usage is ~90% (which isn't normal either)

-On Core i5-2400S with dedicated Radeon X1650, UI runs blazing fast, CPU usage is ~50%.

--------------------Update6:--------------------

A snip of code showing how I update a single graph ( graphFFT is an encapsulation of ZedGraphControl for ease of use):

public void LoopDataRefresh() //executes in a new thread
        {
            while (true)
            {
                while (!d.Connected)
                    Thread.Sleep(1000);
                if (IsDisposed)
                    return;
//... other graphs update here
                if (signalNewFFT && PanelFFT.Visible)
                {
                    signalNewFFT = false;
                    #region FFT
                    bool newRange = false;
                    if (graphFFT.MaxY != d.fftRangeYMax)
                    {
                        graphFFT.MaxY = d.fftRangeYMax;
                        newRange = true;
                    }
                    if (graphFFT.MinY != d.fftRangeYMin)
                    {
                        graphFFT.MinY = d.fftRangeYMin;
                        newRange = true;
                    }

                    List<PointF> points = new List<PointF>(2048);
                    int tempLength = 0;
                    short[] tempData = new short[2048];

                    int i = 0;

                    lock (d.fftDataLock)
                    {
                        tempLength = d.fftLength;
                        tempData = (short[])d.fftData.Clone();
                    }
                    foreach (short s in tempData)
                        points.Add(new PointF(i++, s));

                    graphFFT.SetLine("FFT", points);

                    if (newRange)
                        graphFFT.RefreshGraphComplete();
                    else if (PanelFFT.Visible)
                        graphFFT.RefreshGraph();

                    #endregion
                }
//... other graphs update here
                Thread.Sleep(5);
            }
        }

SetLine is:

public void SetLine(String lineTitle, List<PointF> values)
    {
        IPointListEdit ip = zgcGraph.GraphPane.CurveList[lineTitle].Points as IPointListEdit;
        int tmp = Math.Min(ip.Count, values.Count);
        int i = 0;
        while(i < tmp)
        {
            if (values[i].X > peakX)
                peakX = values[i].X;
            if (values[i].Y > peakY)
                peakY = values[i].Y;
            ip[i].X = values[i].X;
            ip[i].Y = values[i].Y;
            i++;
        }
        while(ip.Count < values.Count)
        {
            if (values[i].X > peakX)
                peakX = values[i].X;
            if (values[i].Y > peakY)
                peakY = values[i].Y;
            ip.Add(values[i].X, values[i].Y);
            i++;
        }
        while(values.Count > ip.Count)
        {
            ip.RemoveAt(ip.Count - 1);
        }
    }

RefreshGraph is:

public void RefreshGraph()
    {
        if (!explicidX && autoScrollFlag)
        {
            zgcGraph.GraphPane.XAxis.Scale.Max = Math.Max(peakX + grace.X, rangeX);
            zgcGraph.GraphPane.XAxis.Scale.Min = zgcGraph.GraphPane.XAxis.Scale.Max - rangeX;
        }
        if (!explicidY)
        {
            zgcGraph.GraphPane.YAxis.Scale.Max = Math.Max(peakY + grace.Y, maxY);
            zgcGraph.GraphPane.YAxis.Scale.Min = minY;
        }
        zgcGraph.Refresh();
    }


--------------------Update7:--------------------

Just ran it through the ANTS profiler. It tells me that the ZedGraph refresh counts when the program is fast are precisely two times higher compared to when it's slow. Here are the screenshots: [ANTS screenshot, slow run] [ANTS screenshot, fast run]

I find it VERY strange that, considering the small difference in the length of the sections, performance differs by a factor of exactly two.

Also, I updated the GPU driver, that didn't help.

--------------------Update8:--------------------

Unfortunately, for a few days now, I've been unable to reproduce the issue... I'm getting a constant, acceptable speed (which still appears a bit slower than what I had in the profiler two weeks ago) that isn't affected by any of the factors that used to affect it two weeks ago: profiler, video capturing or GPU driver window. I still have no explanation of what was causing it...

There are situations when slowing down a thread can speed up other threads significantly, usually when one thread is polling or locking some common resource frequently.

For instance (this is a windows-forms example) when the main thread is checking overall progress in a tight loop instead of using a timer, for example:

private void SomeWork() {
  // start the worker thread here
  while(!PollDone()) {
    progressBar1.Value = PollProgress();
    Application.DoEvents(); // keep the GUI responsive
  }
}

Slowing it down could improve performance:

private void SomeWork() {
  // start the worker thread here
  while(!PollDone()) {
    progressBar1.Value = PollProgress();
    System.Threading.Thread.Sleep(300); // give the polled thread some time to work instead of responding to your poll
    Application.DoEvents(); // keep the GUI responsive
  }
}

To do it correctly, avoid the DoEvents call altogether:

private Timer tim = new Timer(){ Interval=300 };

private void SomeWork() {
  // start the worker thread here
  tim.Tick += tim_Tick;
  tim.Start();
}

private void  tim_Tick(object sender, EventArgs e){
  tim.Enabled = false; // prevent timer messages from piling up
  if(PollDone()){
    tim.Tick -= tim_Tick;
    return;
  }
  progressBar1.Value = PollProgress();
  tim.Enabled = true;
}

Calling Application.DoEvents() can cause a lot of headaches when the GUI has not been disabled and the user kicks off other events, or the same event a second time, while the first is still running. The new handler runs nested on top of the first one, so the stack grows and the first action is effectively queued behind the new one. But I'm going off topic.

That example is probably too WinForms-specific, so I'll try a more general one. If you have a thread that fills a buffer which is processed by other threads, be sure to leave some System.Threading.Thread.Sleep() slack in the loop so the other threads can do some processing before you check again whether the buffer needs filling:

public class WorkItem { 
  // populate with something useful
}

public static object WorkItemsSyncRoot = new object();
public static Queue<WorkItem> workitems = new Queue<WorkItem>();

public void FillBuffer() {
  while(!done) {
    lock(WorkItemsSyncRoot) {
      if(workitems.Count < 30) {
        workitems.Enqueue(new WorkItem(/* load a file or something */ ));
      }
    }
  }
}

The worker threads will have difficulty obtaining anything from the queue, since it's constantly being locked by the filling thread. Adding a Sleep() (outside the lock) could significantly speed up the other threads:

public void FillBuffer() {
  while(!done) {
    lock(WorkItemsSyncRoot) {
      if(workitems.Count < 30) {
        workitems.Enqueue(new WorkItem(/* load a file or something */ ));
      }
    }
    System.Threading.Thread.Sleep(50);
  }
}

Hooking up a profiler could in some cases have the same effect as the sleep function.

I'm not sure I've given representative examples (it's quite hard to come up with something simple), but I guess the point is clear: putting a Sleep() in the right place can improve the flow of other threads.
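As a side note, here is a minimal sketch of a different technique that avoids both the busy loop and the hand-tuned Sleep(): a bounded BlockingCollection<T> (available since .NET 4). It is not what the examples above do. The producer simply blocks while the queue is full and the consumer blocks while it is empty. The WorkItem.Id field and the item count of 100 are assumptions made up for the demonstration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical work item; the Id field exists only for this demo.
public class WorkItem { public int Id; }

public static class BoundedQueueDemo
{
    public static void Main()
    {
        // Capacity 30 plays the role of the "Count < 30" check in the lock-based version.
        using (var queue = new BlockingCollection<WorkItem>(30))
        {
            Task producer = Task.Factory.StartNew(() =>
            {
                for (int i = 0; i < 100; i++)
                    queue.Add(new WorkItem { Id = i }); // blocks while the queue is full
                queue.CompleteAdding();                 // signals that no more items will come
            });

            int processed = 0;
            Task consumer = Task.Factory.StartNew(() =>
            {
                // Blocks while the queue is empty; the loop ends after CompleteAdding().
                foreach (WorkItem item in queue.GetConsumingEnumerable())
                    processed++;
            });

            Task.WaitAll(producer, consumer);
            Console.WriteLine("Processed {0} items", processed);
        }
    }
}
```

Because neither side spins on the lock, there is no polling thread left for a profiler (or a Sleep) to slow down.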

---------- Edit after Update7 -------------

I'd remove that LoopDataRefresh() thread altogether. Instead, put a timer in your window with an interval of at least 20 ms (which would be 50 frames a second if none were skipped):

private void tim_Tick(object sender, EventArgs e) {
  tim.Enabled = false; // skip frames that come while we're still drawing
  if(IsDisposed) {
    tim.Tick -= tim_Tick;
    return;
  }

  // Your code follows, I've tried to optimize it here and there, but no guarantee that it compiles or works, not tested at all

  if(signalNewFFT && PanelFFT.Visible) {
    signalNewFFT = false;

    #region FFT
    bool newRange = false;
    if(graphFFT.MaxY != d.fftRangeYMax) {
      graphFFT.MaxY = d.fftRangeYMax;
      newRange = true;
    }
    if(graphFFT.MinY != d.fftRangeYMin) {
      graphFFT.MinY = d.fftRangeYMin;
      newRange = true;
    }

    int tempLength = 0;
    short[] tempData;

    int i = 0;

    lock(d.fftDataLock) {
      tempLength = d.fftLength;
      tempData = (short[])d.fftData.Clone();
    }

    graphFFT.SetLine("FFT", tempData);

    if(newRange) graphFFT.RefreshGraphComplete();
    else if(PanelFFT.Visible) graphFFT.RefreshGraph();
    #endregion

  }
  // End of your code

  tim.Enabled = true; // Re-arm the timer even when there was nothing new to draw, otherwise it would stay disabled.
}

Here's the optimized SetLine() which no longer takes a list of points but the raw data:

public class GraphFFT {
  public void SetLine(String lineTitle, short[] values) {
    IPointListEdit ip = zgcGraph.GraphPane.CurveList[lineTitle].Points as IPointListEdit;
    int tmp = Math.Min(ip.Count, values.Length);
    int i = 0;
    peakX = values.Length;

    // Overwrite the points that already exist
    while(i < tmp) {
      if(values[i] > peakY) peakY = values[i];
      ip[i].X = i;
      ip[i].Y = values[i];
      i++;
    }
    // Append points if the new data is longer (arrays have Length, not Count)
    while(ip.Count < values.Length) {
      if(values[i] > peakY) peakY = values[i];
      ip.Add(i, values[i]);
      i++;
    }
    // Drop surplus points if the new data is shorter
    while(ip.Count > values.Length) {
      ip.RemoveAt(ip.Count - 1);
    }
  }
}

I hope you get that working. As I commented before, I haven't had the chance to compile or check it, so there could be some bugs in there. There's more to be optimized, but those optimizations should be marginal compared to the boost from skipping frames and only collecting data when we have the time to actually draw the frame before the next one comes in.

If you closely study the graphs in the video at iZotope, you'll notice that they too are skipping frames, and are sometimes a bit jumpy. That's not bad at all; it's a trade-off you make between the processing power of the foreground thread and the background workers.

If you really want the drawing to be done in a separate thread, you'll have to draw the graph to a bitmap (calling Draw() and passing the bitmap's device context), then pass the bitmap on to the main thread and have it update. That way you lose the convenience of the designer and property grid in your IDE, but you can make use of otherwise idle processor cores.
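A rough sketch of the off-screen idea, under stated assumptions: it uses plain System.Drawing rather than ZedGraph's own Draw() call, and the class name, colors and scaling are made up for illustration. The point is only that the rendering method touches no controls, so it can run on a worker thread.

```csharp
using System;
using System.Drawing;

public static class OffscreenPlot
{
    // Renders the samples as a polyline into a new bitmap.
    // Touches no WinForms controls, so it may be called from a worker thread.
    public static Bitmap Render(short[] samples, int width, int height, short minY, short maxY)
    {
        var bmp = new Bitmap(width, height);
        using (Graphics g = Graphics.FromImage(bmp))
        using (var pen = new Pen(Color.Lime))
        {
            g.Clear(Color.Black);
            if (samples.Length < 2 || maxY <= minY)
                return bmp; // nothing sensible to draw

            var pts = new PointF[samples.Length];
            float xStep = (width - 1) / (float)(samples.Length - 1);
            float yScale = (height - 1) / (float)(maxY - minY);
            for (int i = 0; i < samples.Length; i++)
            {
                // Flip Y because bitmap coordinates grow downwards.
                pts[i] = new PointF(i * xStep, (height - 1) - (samples[i] - minY) * yScale);
            }
            g.DrawLines(pen, pts);
        }
        return bmp;
    }
}
```

The worker thread would call Render() and then hand the finished bitmap to the UI thread, for example through BeginInvoke on a (hypothetical) PictureBox, which assigns it to Image and disposes the previous frame.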

---------- edit answer to remarks --------

Yes, there is a way to tell what calls what. Look at your first screenshot: you have selected the "call tree" graph. Each following line is indented a bit (it's a tree view, not just a list!). In a call graph, each tree node represents a method that has been called by its parent tree node (method).

In the first image, WndProc was called about 1800 times and handled 872 messages, of which 62 triggered ZedGraphControl.OnPaint() (which in turn accounts for 53% of the main thread's total time).

The reason you don't see another root node is that the third dropdown box has "[604] Main Thread" selected, which I didn't notice before.

As for the smoother graphs, I have second thoughts on that now, after looking more closely at the screenshots. The main thread has clearly received more (double the number of) update messages, and the CPU still has some headroom.

It looks like the threads are out of sync and in sync at different times: the update messages arrive just too late for a while (when WndProc was done and went to sleep), and then suddenly in time for a while. I'm not very familiar with ANTS, but does it have a side-by-side thread timeline including sleep time? You should be able to see what's going on in such a view. Microsoft's threads view tool would come in handy for this: [image]

Luaan posted the solution in the comments above: it's the system-wide timer resolution. The default resolution is 15.6 ms, and the profiler sets it to 1 ms.
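One way to see the effect is to time the same Thread.Sleep(5) call that the refresh loop in the question uses. This is only a measurement sketch (the iteration count of 50 is arbitrary). With the default resolution one would expect each sleep to last roughly one timer period rather than 5 ms, but the exact numbers depend on the Windows version and on whatever other programs have requested.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class SleepGranularity
{
    static void Main()
    {
        const int iterations = 50;

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            Thread.Sleep(5); // same request as the loop in the question
        sw.Stop();

        double averageMs = sw.Elapsed.TotalMilliseconds / iterations;
        Console.WriteLine("Requested 5 ms, average actual sleep: {0:F2} ms", averageMs);
    }
}
```

Running it once normally and once while a profiler (or anything else that raises the timer resolution) is open should make the difference visible.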

I had the exact same problem, very slow execution that would speed up when the profiler was opened. The problem went away on my PC but popped back up on other PCs seemingly at random. We also noticed the problem disappeared when running a Join Me window in Chrome.

My application transmits a file over a CAN bus. The app loads a CAN message with eight bytes of data, transmits it and waits for an acknowledgment. With the timer set to 15.6ms each round trip took exactly 15.6ms and the entire file transfer would take about 14 minutes. With the timer set to 1ms round trip time varied but would be as low as 4ms and the entire transfer time would drop to less than two minutes.

You can verify your system timer resolution as well as find out which program increased the resolution by opening a command prompt as administrator and entering:

powercfg -energy -duration 5

The output file will have the following in it somewhere:

Platform Timer Resolution:Platform Timer Resolution
The default platform timer resolution is 15.6ms (15625000ns) and should be used whenever the system is idle. If the timer resolution is increased, processor power management technologies may not be effective. The timer resolution may be increased due to multimedia playback or graphical animations.
Current Timer Resolution (100ns units) 10000
Maximum Timer Period (100ns units) 156001

My current resolution is 1 ms (10,000 units of 100 ns), and the report follows it with a list of the programs that requested the increased resolution.
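The unit conversion is easy to check in code, because a .NET TimeSpan tick is also 100 ns. A small sketch using the two values from the report above:

```csharp
using System;

static class TimerUnits
{
    static void Main()
    {
        // powercfg reports in 100 ns units, which is exactly one TimeSpan tick.
        double currentMs = TimeSpan.FromTicks(10000).TotalMilliseconds;  // "Current Timer Resolution"
        double maximumMs = TimeSpan.FromTicks(156001).TotalMilliseconds; // "Maximum Timer Period"

        Console.WriteLine("Current: {0} ms, maximum period: {1} ms", currentMs, maximumMs);
    }
}
```

So 10000 units is 1 ms and 156001 units is about 15.6 ms, matching the default quoted in the report.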

This information as well as more detail can be found here: https://randomascii.wordpress.com/2013/07/08/windows-timer-resolution-megawatts-wasted/

Here is some code to increase the timer resolution (originally posted as the answer to this question: how to set timer resolution from C# to 1 ms? ):

using System.Runtime.InteropServices; // DllImport
using System.Security;                // SuppressUnmanagedCodeSecurity

public static class WinApi
{
    /// <summary>TimeBeginPeriod(). See the Windows API documentation for details.</summary>
    [System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Interoperability", "CA1401:PInvokesShouldNotBeVisible"), System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Security", "CA2118:ReviewSuppressUnmanagedCodeSecurityUsage"), SuppressUnmanagedCodeSecurity]
    [DllImport("winmm.dll", EntryPoint = "timeBeginPeriod", SetLastError = true)]
    public static extern uint TimeBeginPeriod(uint uMilliseconds);

    /// <summary>TimeEndPeriod(). See the Windows API documentation for details.</summary>
    [System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Interoperability", "CA1401:PInvokesShouldNotBeVisible"), System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Security", "CA2118:ReviewSuppressUnmanagedCodeSecurityUsage"), SuppressUnmanagedCodeSecurity]
    [DllImport("winmm.dll", EntryPoint = "timeEndPeriod", SetLastError = true)]
    public static extern uint TimeEndPeriod(uint uMilliseconds);
}

Use it like this to increase resolution : WinApi.TimeBeginPeriod(1);

And like this to return to the default : WinApi.TimeEndPeriod(1);

The parameter passed to TimeEndPeriod() must match the parameter that was passed to TimeBeginPeriod().
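To make the matching begin/end pair hard to get wrong, the two calls can be wrapped in a small disposable. This is a sketch, not part of the original answer: the class name TimerResolutionScope is made up, and it declares its own P/Invoke signatures so that it stands alone.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper: raises the timer resolution on construction
// and restores it with the same value on Dispose().
public sealed class TimerResolutionScope : IDisposable
{
    [DllImport("winmm.dll", EntryPoint = "timeBeginPeriod")]
    private static extern uint TimeBeginPeriod(uint uMilliseconds);

    [DllImport("winmm.dll", EntryPoint = "timeEndPeriod")]
    private static extern uint TimeEndPeriod(uint uMilliseconds);

    private const uint TimerrNoError = 0; // TIMERR_NOERROR

    private readonly uint _periodMs;
    private bool _active;

    public TimerResolutionScope(uint periodMs)
    {
        _periodMs = periodMs;
        // Only remember the request if Windows accepted it.
        _active = TimeBeginPeriod(periodMs) == TimerrNoError;
    }

    public void Dispose()
    {
        if (!_active) return;
        TimeEndPeriod(_periodMs); // must match the value passed to TimeBeginPeriod
        _active = false;
    }
}
```

A WinForms program could then wrap its Application.Run(...) call in using (new TimerResolutionScope(1)) { ... }, so the resolution is restored even if the program exits through an exception.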

I have never heard of or seen anything similar. I'd recommend the common-sense approach of commenting out sections of code, or injecting returns at the top of functions, until you find the logic that's producing the side effect. You know your code and likely have an educated guess about where to start chopping. Otherwise, chop out almost everything as a sanity test and start adding blocks back. I'm often amazed how fast one can track down those seemingly impossible bugs this way. Once you find the related code, you will have more clues to solve your issue.

If you have a method which throws a lot of exceptions, it can run slowly in debug mode and fast in CPU Profiling mode.

As detailed here , debug performance can be improved by using the DebuggerNonUserCode attribute. For example:

[DebuggerNonUserCode]
public static bool IsArchive(string filename)
{
    bool result = false;
    try
    {
        //this calls an external library, which throws an exception if the file is not an archive
        result = ExternalLibrary.IsArchive(filename);
    }
    catch
    {

    }
    return result;
}
