
C# application CPU performance drastically slows down when program creates a high number of objects

I am running into this weird performance issue:

  1. I have a C# application which creates millions of C# objects.

  2. In an unrelated part of the code, the application does some specific work which does not depend on the data allocated in step 1.

The CPU times seem to be correlated to the number of objects created at step 1.

I wrote a simple C# test case which reproduces the issue. The slowdown command takes the number of millions of string objects to create before the DoMyWork() method is called. As you can see, the same DoMyWork() method can take up to 3 s when 200 million strings have been instantiated.

  • Am I missing something in the language?
  • Assuming the physical memory limit is not reached, is there a maximum number of objects that should not be exceeded, beyond which the CLR slows down?

I ran my test under Windows 10 on an Intel Core i7-6700; the program is a console application built in 32-bit Release mode (VS 2017, .NET Framework 4.6.1):

slowdown 0
Allocating 40000 hashtables: 2 ms
Allocating 40000 hashtables: 4 ms
Allocating 40000 hashtables: 15 ms
Allocating 40000 hashtables: 2 ms
Allocating 40000 hashtables: 5 ms
Allocating 40000 hashtables: 5 ms
Allocating 40000 hashtables: 2 ms
Allocating 40000 hashtables: 18 ms
Allocating 40000 hashtables: 10 ms
Allocating 40000 hashtables: 19 ms

slowdown 0 uses ~30 MB of memory.

slowdown 200
Allocating 40000 hashtables: 392 ms
Allocating 40000 hashtables: 1120 ms
Allocating 40000 hashtables: 3067 ms
Allocating 40000 hashtables: 2 ms
Allocating 40000 hashtables: 31 ms
Allocating 40000 hashtables: 418 ms
Allocating 40000 hashtables: 15 ms
Allocating 40000 hashtables: 2 ms
Allocating 40000 hashtables: 18 ms
Allocating 40000 hashtables: 416 ms

slowdown 200 uses ~800 MB of memory.


using System;
using System.Diagnostics;
using System.Collections;

namespace SlowDown
{
  class Program
  {
    static string[] arr;

    static void CreateHugeStringArray(long size)
    {
      arr = new string[size * 1000000];
      for (int i = 0; i < arr.Length; i++) arr[i] = "";
    }


    static void DoMyWork()
    {
      int n = 40000;
      Console.Write("Allocating " + n + " hashtables: ");
      Hashtable[] aht = new Hashtable[n];

      for (int i = 0; i < n; i++)
      {
        aht[i] = new Hashtable();
      }
    }


    static void Main(string[] args)
    {
      if (0 == args.Length) return;
      CreateHugeStringArray(Convert.ToInt64(args[0]));

      for (int i = 0; i < 10 ; i++)
      {
        Stopwatch sw = Stopwatch.StartNew();
        DoMyWork();
        sw.Stop();
        Console.Write(sw.ElapsedMilliseconds + " ms\n");
      }
    }
  }
}

This is likely Garbage Collector activity, which can freeze your main thread even though most of its work happens on a background thread, as mentioned here: Garbage Collector Thread

If you force a collection first, the time stays (in my case) around 90 ms regardless of the size of the "unrelated" array.
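One way to confirm this (a small diagnostic sketch of my own, not from the original answers) is to compare GC.CollectionCount before and after the timed call; if the counters jump while DoMyWork runs, the pauses come from the collector:

    int gen0 = GC.CollectionCount(0);
    int gen1 = GC.CollectionCount(1);
    int gen2 = GC.CollectionCount(2);

    Stopwatch sw = Stopwatch.StartNew();
    DoMyWork();
    sw.Stop();

    Console.WriteLine("{0} ms, collections during work: gen0={1}, gen1={2}, gen2={3}",
        sw.ElapsedMilliseconds,
        GC.CollectionCount(0) - gen0,
        GC.CollectionCount(1) - gen1,
        GC.CollectionCount(2) - gen2);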

The issue is caused by the Garbage Collector running at the same time as your DoMyWork. The sheer size of the array it has to traverse 'interrupts' the real work.

To see the impact of the GC, add these lines before your StartNew call - so that the GC work occurs prior to the timing:

GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect();
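For reference, the question's timing loop with those two lines added looks like this (my sketch; GCSettings requires a using System.Runtime; directive):

    for (int i = 0; i < 10; i++)
    {
        // Let the GC finish its work (and compact the LOH) before the measurement starts.
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();

        Stopwatch sw = Stopwatch.StartNew();
        DoMyWork();
        sw.Stop();
        Console.Write(sw.ElapsedMilliseconds + " ms\n");
    }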

The following code creates 10,000 new string objects (one per concatenation), which forces the garbage collector to run:

 string str = "";

 for (int i = 0; i < 10000; i++) str += i;

The performance of the garbage collector is proportional to

  • The number of objects which have been allocated
  • The total amount of memory in use

Your CreateHugeStringArray() allocates very large objects, increasing the total amount of memory in use. In extreme cases, parts of this memory may be on disk (paged out), further slowing down the system.
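To put numbers on both factors for a given run (my own sketch, using standard APIs), print the managed heap size and the process working set right after CreateHugeStringArray:

    // System.Diagnostics is already imported in the question's program.
    long managedBytes = GC.GetTotalMemory(false);                      // managed heap currently in use
    long workingSetBytes = Process.GetCurrentProcess().WorkingSet64;   // physical memory used by the process

    Console.WriteLine("Managed heap: {0:N0} bytes", managedBytes);
    Console.WriteLine("Working set:  {0:N0} bytes", workingSetBytes);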

The moral of the story: don't allocate memory unless you need it.

I have not found the root cause yet, but it seems that having a huge array on the Large Object Heap (LOH) slows down garbage collection significantly. However, if we create many smaller arrays holding the same amount of data (so they end up in Generation 2 instead of the LOH), the GC does not slow down nearly as much.

An array of 1 million string references occupies about 4 million bytes in a 32-bit process. To stay off the LOH, an object must occupy less than 85,000 bytes, so each array has to be roughly 50 times smaller (about 20,000 references, i.e. ~80,000 bytes). You can use the old trick of splitting one big array into many small arrays:

    private static string[][] arrayTwoDimensional;

    private static int _arrayLength = 1000000;

    private static int _sizeFromExample = 200;

    static void CreateHugeStringArrayTwoDimensional()
    {
        // Use 50 times more outer arrays...
        arrayTwoDimensional = new string[_sizeFromExample * 50][];

        for (long i = 0; i < arrayTwoDimensional.Length; i++)
        {
            // ...and make each inner array 50 times smaller, so it stays
            // below the 85,000-byte LOH threshold.
            arrayTwoDimensional[i] = new string[_arrayLength / 50];
            for (var index = 0; index < arrayTwoDimensional[i].Length; index++)
            {
                arrayTwoDimensional[i][index] = "";
            }
        }
    }

    static string GetByIndex(long index)
    {
        // Map a flat index onto the jagged array: outer array first, then offset within it.
        var arrayLength = _arrayLength / 50;
        var firstIndex = index / arrayLength;
        var secondIndex = index % arrayLength;

        return arrayTwoDimensional[firstIndex][secondIndex];
    }
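For completeness, a small usage sketch (my own, reusing the names above and DoMyWork from the question; the concrete index is just an example): build the jagged array once, then read any element through GetByIndex as if the storage were still flat.

    static void Main(string[] args)
    {
        // Replaces CreateHugeStringArray from the question.
        CreateHugeStringArrayTwoDimensional();

        Stopwatch sw = Stopwatch.StartNew();
        DoMyWork();
        sw.Stop();
        Console.Write(sw.ElapsedMilliseconds + " ms\n");

        // Flat-style access still works through the helper.
        Console.WriteLine(GetByIndex(123456789).Length);   // 0, every slot holds ""
    }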

Proof that the GC is the bottleneck here (profiler screenshots from the original post):

[Screenshot: time spent inside DoMyWork with the original single huge array]

[Screenshot: the same measurement after replacing the array layout]

In the example, the array sizes are hard-coded. There is a good example on CodeProject of how to calculate the size of the stored object type, which helps when adjusting the array sizes: https://www.codeproject.com/Articles/129541/NET-memory-problem-with-uncontrolled-LOH-size-and
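Without a profiler, a rough rule of thumb (my addition, based on the documented 85,000-byte LOH threshold) is to derive the maximum safe element count from the reference size of the process:

    // Each element of string[] is a reference: 4 bytes in a 32-bit process, 8 bytes in a 64-bit one.
    // Keeping the array below 85,000 bytes keeps it off the Large Object Heap
    // (leave a little headroom for the array object header).
    int maxElementsPerArray = 85000 / IntPtr.Size;   // ~21,250 on x86, ~10,625 on x64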
