Large unexplained memory in the memory dump of a .NET process

I can't explain most of the memory used by a C# process. The total memory is 10 GB, but reachable and unreachable objects together add up to only 2.5 GB. What could the other 7.5 GB be?

I'm looking for the most likely explanations, or a method to find out what this memory is.

Here is the precise situation. The process runs on .NET 4.5.1. It downloads pages from the internet and processes them with machine learning. VMMap shows that the memory is almost entirely in the Managed Heap, which seems to rule out an unmanaged memory leak.

The process had been running for days and its memory grew slowly. At some point the memory reached 11 GB. I stopped everything running in the process, then ran garbage collections, including large object heap compaction, several times at one-minute intervals:

GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect();
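
To make the effect of the collection step easier to quantify, it can be wrapped in a helper that also reports what the GC and the operating system each think is in use. This is only a sketch: `HeapReport` and `CollectAndMeasure` are hypothetical names, and a large gap between the two numbers is a hint of free space inside the managed heap, not proof of it.

```csharp
using System;
using System.Diagnostics;
using System.Runtime;

public static class HeapReport
{
    // Returns { managed bytes reported by the GC, private bytes of the process }.
    public static long[] CollectAndMeasure()
    {
        // Request LOH compaction for the next blocking gen 2 collection (.NET 4.5.1+).
        GCSettings.LargeObjectHeapCompactionMode =
            GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();

        // Let finalizers run, then collect what they released.
        GC.WaitForPendingFinalizers();
        GC.Collect();

        long managedBytes = GC.GetTotalMemory(false);
        long privateBytes;
        using (Process current = Process.GetCurrentProcess())
        {
            privateBytes = current.PrivateMemorySize64;
        }
        return new[] { managedBytes, privateBytes };
    }
}
```

Calling `HeapReport.CollectAndMeasure()` before taking the dump gives two figures to compare against what the profiler later reports.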

The memory went down to 10 GB. Then I created the dump:

procdump -ma psid

The dump is 10 GB, as expected.

I opened the dump with .NET Memory Profiler (version 5.6). It shows a total of 2.2 GB of reachable objects and 0.3 GB of unreachable objects. What could explain the remaining 7.5 GB?

Possible explanations I've been thinking of:

  • the LOH does not really get fully compacted
  • some memory is used beyond the objects displayed by the profiler

After investigation, the problem turned out to be heap fragmentation caused by pinned buffers. I'll explain how to investigate and what pinned buffers are.

Every profiler I used agreed that most of the heap is free space, so the next step was to look at fragmentation. This can be done with WinDbg, for example:

!dumpheap -stat

Then I looked at the "Fragmented blocks larger than..." section. WinDbg reports that objects lie between the free blocks, which makes compaction impossible. Next I looked at what is holding these objects and whether they are pinned. Here, for example, is the object at address 0000000bfaf93b80:

!gcroot 0000000bfaf93b80

It displays the reference graph:

00000004082945e0 (async pinned handle)
-> 0000000535b3a3e0 System.Threading.OverlappedData
-> 00000006f5266d38 System.Threading.IOCompletionCallback
-> 0000000b35402220 System.Net.Sockets.SocketAsyncEventArgs
-> 0000000bf578c850 System.Net.Sockets.Socket
-> 0000000bf578c900 System.Net.SocketAddress
-> 0000000bfaf93b80 System.Byte[]

00000004082e2148 (pinned handle)
-> 0000000bfaf93b80 System.Byte[]

The last two lines tell you the object is pinned.

Pinned objects are buffers that can't be moved because their address is shared with unmanaged code. Here you can guess that it is the system TCP layer. When managed code needs to pass the address of a buffer to external code, it has to "pin" the buffer so that the address remains valid: the GC cannot move it.
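
To make pinning concrete, here is a minimal sketch that uses `GCHandle` to pin a byte array by hand. The async socket API pins its buffers internally (the "async pinned handle" in the graph above), so this is an illustration of the mechanism rather than the code path that was actually involved. `PinningSketch` and `PinAndRelease` are hypothetical names.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PinningSketch
{
    // Pins a buffer, reads its fixed address, then releases the pin.
    public static bool PinAndRelease(int size)
    {
        byte[] buffer = new byte[size];

        // While this handle is allocated, the GC may not relocate 'buffer'.
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            // This is the kind of address unmanaged code would receive.
            IntPtr address = handle.AddrOfPinnedObject();
            return address != IntPtr.Zero;
        }
        finally
        {
            // Without this call the buffer stays pinned and keeps blocking compaction.
            handle.Free();
        }
    }
}
```

A handle that is never freed leaves its buffer fixed in place, and every such buffer is an obstacle the GC has to compact around.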

Although these buffers are a very small part of the memory, they make compaction impossible and thus cause a large memory "leak". It is not exactly a leak but rather a fragmentation problem. This can happen on the LOH or on the generational heaps just the same. The question now is what keeps these pinned objects alive forever, that is, the root cause of the leak that produces the fragmentation.

You can read similar questions here:

Note: the root cause was in a third-party library, AerospikeClient, which uses the .NET async Socket API. That API is known for pinning the buffers passed to it. AerospikeClient did use a buffer pool properly, but the pool was re-created every time their client was re-created. Since we re-created their client every hour instead of keeping a single one for the lifetime of the process, a new buffer pool was created each time, causing a growing number of pinned buffers and, in turn, unbounded fragmentation. What remains unclear is why old buffers are never unpinned when transmission is over, or at least when their client is disposed.
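
One way to avoid re-creating the pool is to keep a single client for the whole lifetime of the process. The sketch below illustrates the idea with a hypothetical `PooledClient` stand-in that allocates its buffers in the constructor. It is not the AerospikeClient API, and the pool size and buffer size are arbitrary assumptions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a client that owns a pool of socket buffers.
public sealed class PooledClient : IDisposable
{
    // Total buffers ever created, kept only to make the effect visible.
    public static int BuffersAllocated;

    private readonly List<byte[]> _pool = new List<byte[]>();

    public PooledClient(int poolSize, int bufferSize)
    {
        for (int i = 0; i < poolSize; i++)
        {
            _pool.Add(new byte[bufferSize]);
            BuffersAllocated++;
        }
    }

    public void Dispose()
    {
        _pool.Clear();
    }
}

public static class ClientHolder
{
    // Created once, on first use, and shared by every caller afterwards.
    private static readonly Lazy<PooledClient> Shared =
        new Lazy<PooledClient>(() => new PooledClient(16, 8192));

    public static PooledClient Instance
    {
        get { return Shared.Value; }
    }
}
```

Every caller goes through `ClientHolder.Instance`, so the number of pooled buffers stays fixed no matter how long the process runs.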
