
Extensive use of LOH causes significant performance issue

We have a web service using WebApi 2 and .NET 4.5 on Server 2012. We were seeing occasional latency increases of 10-30 ms with no apparent cause. We were able to track the problematic piece of code down to the LOH and GC.

There is some text which we convert to its UTF8 byte representation (actually, the serialization library we use does that). As long as the text is shorter than 85,000 bytes, latency is stable and short: ~0.2 ms both on average and at the 99th percentile. As soon as the 85,000-byte boundary is crossed, average latency increases to ~1 ms while the 99th percentile jumps to 16-20 ms. The profiler shows that most of the time is spent in GC. To be certain, if I put GC.Collect between iterations, the measured latency goes back to 0.2 ms.

I have two questions:

  1. Where does the latency come from? As far as I understand, the LOH isn't compacted. The SOH is compacted, but doesn't show this latency.
  2. Is there a practical way to work around this? Note that I can't control the size of the data and make it smaller.

---

public void PerfTestMeasureGetBytes()
{
    var text = File.ReadAllText(@"C:\Temp\ContactsModelsInferences.txt");
    var smallText = text.Substring(0, 85000 + 100);
    int count = 1000;
    List<double> latencies = new List<double>(count);
    for (int i = 0; i < count; i++)
    {
        Stopwatch sw = new Stopwatch();
        sw.Start();
        var bytes = Encoding.UTF8.GetBytes(smallText);
        sw.Stop();
        latencies.Add(sw.Elapsed.TotalMilliseconds);

        //GC.Collect(2, GCCollectionMode.Default, true);
    }

    latencies.Sort();
    Console.WriteLine("Average: {0}", latencies.Average());
    Console.WriteLine("99%: {0}", latencies[(int)(latencies.Count * 0.99)]);
}

The performance problems usually come from two areas: allocation and fragmentation.

Allocation

The runtime guarantees zeroed memory, so it spends cycles clearing it. When you allocate a large object, that's a lot of memory to clear, and it starts to add milliseconds to a single allocation (when, let's be honest, simple allocation in .NET is actually very fast, so we usually never care about this).

Fragmentation

Fragmentation occurs when LOH objects are allocated and then reclaimed. Until recently, the GC could not reorganise the memory to remove these old-object "gaps", and so could only fit the next object into a gap if it was the same size or smaller. Recently, the GC gained the ability to compact the LOH, which removes this issue but costs time during compaction.
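As an aside, on .NET 4.5.1 and later that compaction can be requested explicitly. A minimal sketch (this is the documented `GCSettings` API, not code from the original question):

```csharp
using System;
using System.Runtime;

class LohCompactionDemo
{
    static void Main()
    {
        // Request a one-off LOH compaction: the next blocking Gen2
        // collection compacts the LOH, then the setting resets itself
        // to the default (no compaction).
        GCSettings.LargeObjectHeapCompactionMode =
            GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();
    }
}
```

Note that this trades fragmentation for an expensive pause, so it is best done at a quiet moment rather than on the request path.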

My guess is that in your case you are suffering from both issues and triggering GC runs, but it depends on how often your code attempts to allocate items on the LOH. If you are doing lots of allocations, try the object-pooling route. If you cannot control a pool effectively (lumpy object lifetimes or disparate usage patterns), try chunking the data you are working with so you avoid the LOH completely.


Your Options

I've encountered two approaches to the LOH:

  • Avoid it.
  • Use it, but realise you are using it, and manage it explicitly.

Avoid it

This involves chunking your large object (usually an array of some sort) into, well, chunks that each fall under the LOH threshold. We do this when serialising large object streams. It works well, but an implementation would be specific to your environment, so I'm hesitant to provide a coded example.
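Purely as an illustration of the idea (the helper below is hypothetical, and only helps if the data can be produced and consumed piece by piece rather than first materialised as one big array), chunking might look like:

```csharp
using System;
using System.Collections.Generic;

static class Chunker
{
    // Safely below the 85,000-byte LOH threshold.
    const int ChunkSize = 80 * 1024;

    // Copy a payload into sub-LOH-sized pieces so each piece
    // is allocated on the small object heap.
    public static List<byte[]> Chunk(byte[] source)
    {
        var chunks = new List<byte[]>();
        for (int offset = 0; offset < source.Length; offset += ChunkSize)
        {
            int size = Math.Min(ChunkSize, source.Length - offset);
            var chunk = new byte[size];
            Buffer.BlockCopy(source, offset, chunk, 0, size);
            chunks.Add(chunk);
        }
        return chunks;
    }
}
```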

Use it

A simple way to tackle both allocation and fragmentation is long-lived objects. Explicitly allocate an empty array (or arrays) large enough to accommodate your large object, and don't get rid of it (or them). Leave it around and re-use it like an object pool. You pay for the initial allocation, but can do so on first use or during application idle time; you pay less overall because you aren't re-allocating, and you lessen fragmentation because you aren't constantly asking to allocate things and you aren't reclaiming items (which is what causes the gaps in the first place).

That said, a halfway house may be in order. Reserve a section of memory up-front for an object pool. Done early, these allocations should be contiguous in memory, so you won't get any gaps, and you leave the tail end of the available memory for uncontrolled items. Do beware, though, that this obviously has an impact on the working set of your application: an object pool takes space whether it is used or not.
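A minimal sketch of such a pool (names are hypothetical; a production pool would also need sizing limits and a clear ownership policy):

```csharp
using System.Collections.Concurrent;

// Illustrative fixed-size buffer pool. Buffers are allocated up-front,
// early in the process lifetime, so they land contiguously on the LOH
// and are never reclaimed (no gaps).
class BufferPool
{
    private readonly ConcurrentBag<byte[]> _buffers = new ConcurrentBag<byte[]>();
    private readonly int _bufferSize;

    public BufferPool(int bufferSize, int initialCount)
    {
        _bufferSize = bufferSize;
        for (int i = 0; i < initialCount; i++)
            _buffers.Add(new byte[bufferSize]);
    }

    public byte[] Rent()
    {
        byte[] buffer;
        // Fall back to a fresh allocation if the pool is exhausted.
        return _buffers.TryTake(out buffer) ? buffer : new byte[_bufferSize];
    }

    public void Return(byte[] buffer)
    {
        _buffers.Add(buffer);
    }
}
```

Callers rent a buffer, use it, and return it; nothing is handed back to the GC, so the LOH layout stays stable.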


Resources

The LOH is covered a lot on the web, but pay attention to the date of the resource. In the latest .NET versions the LOH has received some love and has improved. That said, if you are on an older version, I think the resources on the net are fairly accurate, as the LOH never really received any serious updates in the long time between its inception and .NET 4.5 (ish).

For example, there is this article from 2008: http://msdn.microsoft.com/en-us/magazine/cc534993.aspx

And a summary of improvements in .NET 4.5: http://blogs.msdn.com/b/dotnet/archive/2011/10/04/large-object-heap-improvements-in-net-4-5.aspx

In addition to the following, make sure that you're using the server garbage collector. That doesn't affect how the LOH is used, but my experience is that it does significantly reduce the amount of time spent in GC.
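For a standalone process, the server GC is opted into in configuration; this is the standard .NET Framework setting (ASP.NET-hosted applications generally get it by default):

```xml
<!-- app.config: opt in to the server garbage collector -->
<configuration>
  <runtime>
    <gcServer enabled="true"/>
  </runtime>
</configuration>
```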

The best workaround I found for avoiding large object heap problems is to create a persistent buffer and re-use it. So rather than allocating a new byte array with every call to Encoding.GetBytes, pass the byte array to the method.

In this case, use the GetBytes overload that takes a byte array. Allocate an array large enough to hold the bytes for your longest expected string, and keep it around. For example:

// allocate buffer at class scope
private byte[] _theBuffer = new byte[1024*1024];

public void PerfTestMeasureGetBytes()
{
    // ...
    for (...)
    {
        var sw = Stopwatch.StartNew();
        var numberOfBytes = Encoding.UTF8.GetBytes(smallText, 0, smallText.Length, _theBuffer, 0);
        sw.Stop();
        // ...
    }
}

The only problem here is that you have to make sure your buffer is large enough to hold the largest string. What I've done in the past is to allocate the buffer to the largest size I expect, but then check that it's large enough whenever I go to use it. If it's not large enough, re-allocate it. How you do that depends on how rigorous you want to be. When working with primarily Western European text, I'd just double the string length. For example:

string textToConvert = ...
if (_theBuffer.Length < 2*textToConvert.Length)
{
    // reallocate the buffer
    _theBuffer = new byte[2*textToConvert.Length];
}

Another way to do it is to just try the GetBytes call, and reallocate on failure. Then retry. For example:

bool good = false;
int numberOfBytes = 0;
while (!good)
{
    try
    {
        numberOfBytes = Encoding.UTF8.GetBytes(theString, 0, theString.Length, _theBuffer, 0);
        good = true;
    }
    catch (ArgumentException)
    {
        // buffer isn't big enough. Find out how much I really need
        var bytesNeeded = Encoding.UTF8.GetByteCount(theString);
        // and reallocate the buffer
        _theBuffer = new byte[bytesNeeded];
    }
}

If you make the buffer's initial size large enough to accommodate the largest string you expect, then you probably won't get that exception very often, which means the number of times you have to reallocate the buffer will be very small. You could, of course, add some padding to bytesNeeded so that you allocate more, in case you have other outliers.
