High memory issues in .NET Framework 4 but not in Framework 4.5

I have the following piece of code (.NET 4) that consumes a lot of memory:

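using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

struct Data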
{
    private readonly List<Dictionary<string,string>> _list;

    public Data(List<Dictionary<string,string>> List)
    {
        _list = List;
    }

    public void DoWork()
    {
        int num = 0;
        foreach (Dictionary<string, string> d in _list)
        {
            foreach (KeyValuePair<string, string> kvp in d)
                num += Convert.ToInt32(kvp.Value);
        }

        Console.Write(num);

        //_list = null;
    }
}

class Test1
{
    BlockingCollection<Data> collection = new BlockingCollection<Data>(10);
    Thread th;

    public Test1()
    {
        th = new Thread(Work);
        th.Start();
    }

    public void Read()
    {
        List<Dictionary<string, string>> l = new List<Dictionary<string, string>>();
        Random r = new Random();

        for (int i=0; i<100000; i++)
        {
            Dictionary<string, string> d = new Dictionary<string,string>();
            d["1"]  = r.Next().ToString();
            d["2"]  = r.Next().ToString();
            d["3"]  = r.Next().ToString();
            d["4"]  = r.Next().ToString();

            l.Add(d);
        }

        collection.Add(new Data(l));
    }

    private void Work()
    {
        while (true)
        {
            collection.Take().DoWork();
        }
    }
}

class Program
{
    Test1 t = new Test1();
    static void Main(string[] args)
    {
        Program p = new Program();
        for (int i = 0; i < 1000; i++)
        {
            p.t.Read();
        }
    }
}

The bounded capacity of the blocking collection is 10. As far as I know, the GC should collect the references held by the 'Data' struct as soon as its DoWork method completes. However, memory keeps increasing rapidly until the program crashes or the usage comes down on its own. This happens more often on low-end machines (on some machines memory does not increase at all). Further, when I add the line "_list = null;" at the end of the DoWork method and convert 'Data' into a class (from a struct), memory does not increase.

What could be happening here? I need some suggestions.

Update: the issue occurs on machines with .NET Framework 4 installed (4.5 not installed).

I tried this on my computer; here are the results:

  1. With Data as class and without _list = null at the end of DoWork -> memory increases
  2. With Data as struct and without _list = null at the end of DoWork -> memory increases
  3. With Data as class and with _list = null at the end of DoWork -> memory stabilizes at 150MB
  4. With Data as struct and with _list = null at the end of DoWork -> memory increases

In the cases where _list = null is commented out, this result is not surprising, because there is still a reference to the _list. Even if DoWork is never called again, the GC cannot know that.

In the third case, the garbage collector behaves as we expect.

For the fourth case, the BlockingCollection stores the Data when you pass it as the argument in collection.Add(new Data(l)); but what happens then?

  1. A new struct data is created with data._list equal to l (i.e., since List is a class (reference type), data._list in the struct Data holds the address of l).
  2. You then pass it as the argument in collection.Add(new Data(l)); which creates a copy of the data created in step 1. The address of l is copied along with it.
  3. The blocking collection stores your Data elements in an array.
  4. When DoWork executes _list = null, it removes the reference to the problematic List only in the current struct, not in the copies stored in the BlockingCollection.
  5. So the problem remains unless you clear the BlockingCollection.
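
The copy semantics in steps 1 to 4 can be reproduced without any threading. The following is a minimal sketch, not the code from the question: Holder, BoxedHolder, and CopyDemo are hypothetical names, and a plain array stands in for the array that the blocking collection uses internally.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical value type: assigning it to a variable copies the field.
struct Holder
{
    public List<int> Items;
    public void Release() { Items = null; }
}

// Hypothetical reference type: every variable points at the same object.
class BoxedHolder
{
    public List<int> Items;
    public void Release() { Items = null; }
}

static class CopyDemo
{
    // True when the array slot still references the list
    // after the local struct copy has released it.
    public static bool StructSlotKeepsList()
    {
        var slots = new Holder[1];
        slots[0] = new Holder { Items = new List<int> { 1, 2, 3 } };
        Holder local = slots[0];   // copies the struct, including the list reference
        local.Release();           // clears only the local copy
        return slots[0].Items != null;
    }

    // Same steps with a class: the slot and the local share one object.
    public static bool ClassSlotKeepsList()
    {
        var slots = new BoxedHolder[1];
        slots[0] = new BoxedHolder { Items = new List<int> { 1, 2, 3 } };
        BoxedHolder local = slots[0];  // copies the reference, not the object
        local.Release();               // clears the field on the shared object
        return slots[0].Items != null;
    }

    static void Main()
    {
        Console.WriteLine(StructSlotKeepsList()); // True
        Console.WriteLine(ClassSlotKeepsList());  // False
    }
}
```

With the struct, the slot keeps the list reachable even though the copy that did the work was cleared, which matches case 4 above.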

How to find the problem?

To find a memory leak, I suggest using SOS ( http://msdn.microsoft.com/en-us/library/bb190764.aspx ).

Here is how I found the issue. Because the issue involves not only the heap but also the stack, heap analysis (as here) is not the best way to find the source of the problem.

1 Put a breakpoint on _list = null (because this line should work!)

2 Execute the program

3 When the breakpoint is reached, load the SOS Debugging Tool (type ".load sos" in the Immediate Window)

4 The problem seems to come from the private List<Dictionary<string,string>> _list field, which is not released correctly. So we'll try to find the instances of that type. Type !DumpHeap -stat -type List in the Immediate Window. Result:

total 0 objects
Statistics:
      MT    Count    TotalSize Class Name
0570ffdc        1           24 System.Collections.Generic.List`1[[System.Threading.CancellationTokenRegistration, mscorlib]]
04f63e50        1           24 System.Collections.Generic.List`1[[System.Security.Policy.StrongName, mscorlib]]
00202800        2           48 System.Collections.Generic.List`1[[System.Collections.Generic.Dictionary`2[[System.String, mscorlib],[System.String, mscorlib]], mscorlib]]
Total 4 objects

The problematic type is the last one, List<Dictionary<...>>. There are 2 instances, and the MethodTable (a kind of reference to the type) is 00202800.

5 To get the references, type !DumpHeap -mt 00202800. Result:

 Address       MT     Size
02618a9c 00202800       24     
0733880c 00202800       24     
total 0 objects
Statistics:
      MT    Count    TotalSize Class Name
00202800        2           48 System.Collections.Generic.List`1[[System.Collections.Generic.Dictionary`2[[System.String, mscorlib],[System.String, mscorlib]], mscorlib]]
Total 2 objects

The two instances are shown with their addresses: 02618a9c and 0733880c.

6 To find how they are referenced, type !GCRoot 02618a9c (for the first instance) or !GCRoot 0733880c (for the second). Result (I have not copied the full output but kept the important part):

ESP:3bef9c:Root:  0261874c(ConsoleApplication1.Test1)->
  0261875c(System.Collections.Concurrent.BlockingCollection`1[[ConsoleApplication1.Data, ConsoleApplication1]])->
  02618784(System.Collections.Concurrent.ConcurrentQueue`1[[ConsoleApplication1.Data, ConsoleApplication1]])->
  02618798(System.Collections.Concurrent.ConcurrentQueue`1+Segment[[ConsoleApplication1.Data, ConsoleApplication1]])->
  026187bc(ConsoleApplication1.Data[])->
  02618a9c(System.Collections.Generic.List`1[[System.Collections.Generic.Dictionary`2[[System.String, mscorlib],[System.String, mscorlib]], mscorlib]])

for the first instance, and:

Scan Thread 5216 OSTHread 1460
ESP:3bf0b0:Root:  0733880c(System.Collections.Generic.List`1[[System.Collections.Generic.Dictionary`2[[System.String, mscorlib],[System.String, mscorlib]], mscorlib]])
Scan Thread 4960 OSTHread 1360
Scan Thread 6044 OSTHread 179c

for the second one (when the analyzed object has no deeper root, I think it means it is referenced from the stack).

Looking at 026187bc(ConsoleApplication1.Data[]) should be a good way to understand what happens, because we finally see our Data type.

7 To display the content of an object, use !DumpObj 026187bc or, in this case, as it is an array, use !DumpArray -details 026187bc. Result (partial):

Name:        ConsoleApplication1.Data[]
MethodTable: 00214f30
EEClass:     00214ea8
Size:        140(0x8c) bytes
Array:       Rank 1, Number of elements 32, Type VALUETYPE
Element Methodtable: 00214670
[0] 026187c4
    Name:        ConsoleApplication1.Data
    MethodTable: 00214670
    EEClass:     00211ac4
    Size:        12(0xc) bytes
    File:        D:\Development Projects\Centive Solutions\SVN\trunk\CentiveSolutions.Renderers\ConsoleApplication1\bin\Debug\ConsoleApplication1.exe
    Fields:
              MT    Field   Offset                 Type VT     Attr    Value Name
        00202800  4000001        0     ...lib]], mscorlib]]      0     instance     02618a9c     _list
[1] 026187c8
    Name:        ConsoleApplication1.Data
    MethodTable: 00214670
    EEClass:     00211ac4
    Size:        12(0xc) bytes
    File:        D:\Development Projects\Centive Solutions\SVN\trunk\CentiveSolutions.Renderers\ConsoleApplication1\bin\Debug\ConsoleApplication1.exe
    Fields:
              MT    Field   Offset                 Type VT     Attr    Value Name
        00202800  4000001        0     ...lib]], mscorlib]]      0     instance     6d50950800000000     _list
[2] 026187cc
    Name:        ConsoleApplication1.Data
    MethodTable: 00214670
    EEClass:     00211ac4
    Size:        12(0xc) bytes
    File:        D:\Development Projects\Centive Solutions\SVN\trunk\CentiveSolutions.Renderers\ConsoleApplication1\bin\Debug\ConsoleApplication1.exe
    Fields:
              MT    Field   Offset                 Type VT     Attr    Value Name
        00202800  4000001        0     ...lib]], mscorlib]]      0     instance     6d50950800000000     _list

Here we have the value of the _list field for the first 3 elements of the array: 02618a9c, 6d50950800000000, 6d50950800000000. I suspect 6d50950800000000 is the "null pointer".

Here we have the answer to your question: there is an array (referenced by the blocking collection, see step 6) that directly contains the address of the _list we want the garbage collector to reclaim.

8 To confirm that it does not change when the line _list = null is executed, execute the line.

Note

As I've mentioned, DumpHeap is not well suited to the current task, which involves value types. Why? Because instances of value types are not separate heap objects: they live inline in whatever contains them (here, the Data[] array) or on the stack. Seeing this is very simple: try !DumpHeap -stat -type ConsoleApplication1.Data at the breakpoint. Result:

total 0 objects
Statistics:
      MT    Count    TotalSize Class Name
00214c00        1           20 System.Collections.Concurrent.ConcurrentQueue`1[[ConsoleApplication1.Data, ConsoleApplication1]]
00214e24        1           36 System.Collections.Concurrent.ConcurrentQueue`1+Segment[[ConsoleApplication1.Data, ConsoleApplication1]]
00214920        1           40 System.Collections.Concurrent.BlockingCollection`1[[ConsoleApplication1.Data, ConsoleApplication1]]
00214f30        1          140 ConsoleApplication1.Data[]
Total 4 objects

There is an array of Data but no Data, because DumpHeap only lists heap objects. Then, with !DumpArray -details 026187bc, the pointer is still there with the same value. And if you compare the roots of the two instances found earlier (with !GCRoot) before and after executing the line, only one line will have been removed. Indeed, the reference to the list has been removed from only one copy of the value type Data.

If you read Stephen Toub's explanation of how ConcurrentQueue works, the behavior makes sense. BlockingCollection uses ConcurrentQueue by default, which stores its elements in a linked list of 32-element segments.

For the purposes of concurrent access, elements in the linked list are never overwritten, so they don't get unreferenced until the last element of a whole 32-element segment is consumed. Since you have a bounded capacity of 10 elements, let's say that you have produced 41 elements and consumed 31. That means you will have one segment with 31 consumed elements plus one queued element, and another segment with the remaining 9 elements. At this point all 41 elements are referenced, so if each element is 25MB, your collection will be taking up 1GB! Once the next item is consumed, all 32 of the elements in the head segment will be unreferenced and can be collected.
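
The arithmetic in that example can be checked with a small model. This is only a sketch of the rule as described above (a segment's elements are released together once all 32 have been consumed), not the actual ConcurrentQueue implementation; SegmentModel and ReferencedElements are hypothetical names.

```csharp
using System;

static class SegmentModel
{
    // Segment size taken from the description above.
    public const int SegmentSize = 32;

    // Number of elements still referenced by the queue under the
    // simplified rule: nothing in a segment is released until the
    // whole segment has been consumed.
    public static int ReferencedElements(int produced, int consumed)
    {
        if (consumed < 0 || consumed > produced)
            throw new ArgumentOutOfRangeException("consumed");

        int fullyConsumedSegments = consumed / SegmentSize;
        return produced - fullyConsumedSegments * SegmentSize;
    }

    static void Main()
    {
        Console.WriteLine(ReferencedElements(41, 31));      // 41
        Console.WriteLine(ReferencedElements(41, 32));      // 9
        Console.WriteLine(ReferencedElements(41, 31) * 25); // 1025 (MB, at 25MB per element)
    }
}
```

Under this model, a bounded capacity of 10 limits how many elements are waiting to be consumed, but up to 31 already-consumed elements can stay reachable on top of that.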

You may think there should only ever need to be 10 elements in the queue, and that would be the case for a non-concurrent queue, but that would not allow one thread to enumerate the elements in the queue while another thread is producing or consuming elements.

The reason the .NET 4.5 framework doesn't leak is that the behavior was changed to null out elements as soon as they are consumed, as long as nobody is enumerating the queue. If you start enumerating collection, you should see the memory leak even with the .NET 4.5 framework.

The reason that setting _list = null works when you have a class is that you are creating a "box" wrapper that allows you to unreference the list in every place where it is used. Setting the value through your local variable changes the same object that the queue has a reference to.

The reason that setting _list = null doesn't work when you have a struct is that you can only ever change copies of a struct. The "original" version sitting in that queue segment is effectively immutable because ConcurrentQueue doesn't provide a way to change it. In other words, you are changing only the copy of the value in your local variable rather than changing the copy in the queue.
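
As a rough illustration of the class-based approach on .NET 4, the sketch below wraps the list in a reference type and clears it once the work is done. DataBox, BoxDemo, and ProduceAndConsume are hypothetical names, and DoWork returns the sum instead of printing it so that the result can be inspected; it is not the exact code from the question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical reference-type wrapper: the queue slot and the
// consumer both point at this one object.
sealed class DataBox
{
    private List<Dictionary<string, string>> _list;

    public DataBox(List<Dictionary<string, string>> list) { _list = list; }

    public bool IsReleased { get { return _list == null; } }

    public int DoWork()
    {
        int num = 0;
        foreach (Dictionary<string, string> d in _list)
            foreach (KeyValuePair<string, string> kvp in d)
                num += Convert.ToInt32(kvp.Value);

        // Clearing the field here is also visible through any stale
        // queue slot, because the slot holds a reference to this object.
        _list = null;
        return num;
    }
}

static class BoxDemo
{
    // Adds one box, consumes it, and returns the producer's reference
    // so the caller can check that the list was released.
    public static DataBox ProduceAndConsume(out int sum)
    {
        var collection = new BlockingCollection<DataBox>(10);
        var list = new List<Dictionary<string, string>>
        {
            new Dictionary<string, string> { { "1", "20" }, { "2", "22" } }
        };
        var box = new DataBox(list);
        collection.Add(box);
        sum = collection.Take().DoWork();
        return box;
    }

    static void Main()
    {
        int sum;
        DataBox box = ProduceAndConsume(out sum);
        Console.WriteLine(sum);            // 42
        Console.WriteLine(box.IsReleased); // True
    }
}
```

The small DataBox objects may still stay referenced by a queue segment for a while, but the large lists they pointed to become collectable as soon as DoWork finishes.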
