Memory leak Entity Framework

I have a memory leak when I am using Entity Framework with SQL Server Compact Edition. My situation:

I have a file that is about 600 MB in size. I read it line by line, create an entity object for each line, and add it to the SQL Server CE database. Memory grows very fast while this is running; the Gen 0 collections counter and the Gen 2 heap size grow very quickly (info from Process Explorer). If I understand correctly, the Gen 2 heap is for big objects, and I think my entity class is a big object. So Entity Framework is holding on to my objects and not releasing them. I have already tried detaching them and calling GC.Collect(2), but it does not help.

First I read a line, then I create an object from the parsed line, and then I add it to the database. Here is my database code:

DBEntities dbConnection = new DBEntities();
dbConnection.My_Table.AddObject(MyObjectCreatedFromTheLine);
dbConnection.SaveChanges();
//  dbConnection.Detach(MyObjectCreatedFromTheLine);
//  dbConnection.Dispose();
MyObjectCreatedFromTheLine = null;
dbConnection = null;

I also read that the created entity ( MyObjectCreatedFromTheLine ) belongs to the DbContext, so I call this code for every line, creating a new context each time.

What am I doing wrong?

I ran into this problem trying to insert 50,000+ records into a SQL database using Entity Framework. Entity Framework is not meant for huge bulk operations (large insert or delete batches), so I ended up using the System.Data.SqlClient.SqlBulkCopy class instead, which is much more efficient and faster. I even wrote the helper function below to auto-map the columns so I didn't have to construct a SQL INSERT statement by hand (it's more or less type independent, I think).

Basically the workflow is: IList<MyEntityType> -> DataTable -> SqlBulkCopy

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Data.SqlClient;
using System.Linq;

public static void BulkInsert<T>(string connection, string tableName, IList<T> list)
{
    using (var bulkCopy = new SqlBulkCopy(connection, SqlBulkCopyOptions.KeepNulls))
    {
        bulkCopy.BatchSize = list.Count;
        bulkCopy.DestinationTableName = tableName;
        bulkCopy.BulkCopyTimeout = 3000;

        var table = new DataTable();

        // Dirty hack to make sure we only get system data types,
        // i.e. filter out the relationships/collections.
        var props = TypeDescriptor.GetProperties(typeof(T))
                                  .Cast<PropertyDescriptor>()
                                  .Where(propertyInfo => propertyInfo.PropertyType.Namespace.Equals("System"))
                                  .ToArray();

        foreach (var propertyInfo in props)
        {
            bulkCopy.ColumnMappings.Add(propertyInfo.Name, propertyInfo.Name);
            table.Columns.Add(propertyInfo.Name, Nullable.GetUnderlyingType(propertyInfo.PropertyType) ?? propertyInfo.PropertyType);
        }

        var values = new object[props.Length];
        foreach (var item in list)
        {
            for (var i = 0; i < values.Length; i++)
            {
                values[i] = props[i].GetValue(item);
            }

            table.Rows.Add(values);
        }

        bulkCopy.WriteToServer(table);
    }
}

In my case the insert went from 15-20 minutes to under a minute.
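
For reference, a call site might look like this (MyEntity, the connection string, and the table name are placeholders for whatever your model actually uses):

var rows = new List<MyEntity>();   // MyEntity is a placeholder entity type
// ... fill "rows" from your parsed file ...
BulkInsert("Data Source=.;Initial Catalog=MyDb;Integrated Security=True",  // placeholder connection string
           "dbo.My_Table",                                                 // placeholder table name
           rows);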

I think your approach is not right. Create just one DBEntities object and use it to save all of your changes. Something like the following may work:

using(DBEntities dbConnection = new DBEntities())
{
    foreach(MyObjectCreatedFromTheLine entity in ListOfMyObjectCreatedFromTheLine)
    {
        dbConnection.My_Table.AddObject(entity);
    }
    dbConnection.SaveChanges();
}

You are creating a new DBEntities object for each entity, which is simply not right. Setting dbConnection to null does not dispose the object; you are only clearing the reference, so the object stays in memory until the garbage collector eventually collects it.
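
If the list is very large, a variation of the same idea (just a sketch, reusing the names from the code above; the batch size of 1000 is arbitrary) is to keep the single context but call SaveChanges in batches, so the pending change set never grows too large:

using(DBEntities dbConnection = new DBEntities())
{
    int pending = 0;
    foreach(MyObjectCreatedFromTheLine entity in ListOfMyObjectCreatedFromTheLine)
    {
        dbConnection.My_Table.AddObject(entity);

        // Flush every 1000 entities so SaveChanges never has to push
        // the whole list in one go.
        if (++pending % 1000 == 0)
        {
            dbConnection.SaveChanges();
        }
    }
    dbConnection.SaveChanges();   // save the remainder
}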

I don't think adding a huge number of entities through the data context is the best way to go. Every created object consumes memory, because the data context has an internal first-level cache where objects remain until the context is disposed.

I don't know EF well enough to say whether that cache can be cleared each time you persist a single object. However, I would rather not use EF at all for massive inserts.

Instead, use the SqlBulkCopy class. It should resolve your memory issues, and it is also an order of magnitude faster than anything you can achieve with EF and per-object inserts.
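
If you do stay with EF, one common way to keep that first-level cache from growing without bound is to dispose and recreate the context every N rows while streaming the file. A minimal sketch, reusing the context and table names from the question (ParseLine stands in for your own parsing code, and the batch size of 1000 is arbitrary):

DBEntities dbConnection = new DBEntities();
int count = 0;
foreach (string line in File.ReadLines(path))   // requires System.IO, .NET 4+
{
    dbConnection.My_Table.AddObject(ParseLine(line));   // ParseLine is your own parsing code

    if (++count % 1000 == 0)
    {
        dbConnection.SaveChanges();
        dbConnection.Dispose();            // drop the tracked objects
        dbConnection = new DBEntities();   // continue with an empty cache
    }
}
dbConnection.SaveChanges();
dbConnection.Dispose();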

Get your DBEntities dbConnection = new DBEntities() out of the loop!

Creating a new object context on each iteration is both pointless and wasteful.

Allocating an object that large on every iteration also costs time, not to mention the memory overhead and the deallocation pressure, which is probably the real problem here.
