Entity Framework 6 memory usage with a large number of tables due to the InitializedDatabases list

Our application uses a large number of tables (around 50k). All of them are actually used, and this results in high memory consumption in Entity Framework.

After some memory profiling I noticed that DbCompiledModel instances were being kept in memory. After some searching I tracked this down to the LazyInternalContext class, which keeps a list of "InitializedDatabases".

https://github.com/dotnet/ef6/blob/master/src/EntityFramework/Internal/LazyInternalContext.cs#L670

Is there a way to prevent Entity Framework from doing this? It is not a code-first setup; database setup and migrations are not done in this app, if that is what "InitializeDatabaseAction" implies.

Adding a "return null" or setting "InitializerDisabled" to true makes everything work, but I would rather not run a custom Entity Framework build, and I don't know what the impact of simply changing the source would be.

Most tables have the same definition, so I also tried the solution I found here: Change table name at runtime

When trying this I get the error "An open data reader exists for this command". I am using Postgres, and MARS isn't supported there (I have no idea why I would need it, since this only changes the SQL that is run).

The solution was given in a comment by Ivan Stoev and works.

There is no way to turn this off without using reflection. Setting the "InternalContext.InitializerDisabled" property to true makes Entity Framework skip the dictionary.

So:

  • Use a DbContext constructor that takes a DbCompiledModel
  • Use Database.SetInitializer(null);
  • Set InternalContext.InitializerDisabled = true using reflection

Below is code from the sample I used to test this. As a test setup I had 1 main table with 30k partitions. The partitions themselves are queried directly, because Postgres (especially 9.x) does not scale well with a high number of partitions:

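    using System;
    using System.Data.Entity;
    using System.Data.Entity.Infrastructure;
    using System.Data.Entity.ModelConfiguration.Conventions;
    using System.Reflection;
    using Npgsql;

    public class PartContext : DbContext {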
        private static readonly string _ConnectionString = new NpgsqlConnectionStringBuilder {
            Host = "localhost",
            Port = 5432,
            Database = "postgres",
            Username = "postgres",
            Password = "password"
        }.ConnectionString;

        public readonly string Table;
        public readonly string Partition;

        public PartContext(string pMainTable, string pPartition) : base(
            new NpgsqlConnection() { ConnectionString = _ConnectionString },
            PartDbModelBuilder.Get(_ConnectionString, pPartition),
            true
        ) {
            Table = pMainTable;
            Partition = pPartition;

            Database.SetInitializer<PartContext>(null);


            /**
             * Disable database initialization so that DbCompiledModel instances are not kept internally by Entity Framework.
             * Keeping them causes high memory usage when there are a lot of tables.
             * In EF 6.4.2 there was no way to 'manage' that Dictionary externally.
             */
            try {
                // InternalContext is a non-public property of DbContext, so it has to be read via reflection.
                var internalContext = typeof(PartContext).BaseType.GetProperty("InternalContext", BindingFlags.NonPublic | BindingFlags.Instance).GetValue(this, null);
                internalContext.GetType().GetProperty("InitializerDisabled").SetValue(internalContext, true);
            } catch(Exception) {
                // Best effort: if the internals differ in another EF version, keep the default behavior.
            }
        }

        public DbSet<MyPart> Parts { get; set; }

        protected override void OnModelCreating(DbModelBuilder modelBuilder) {
            modelBuilder.HasDefaultSchema("public");
            modelBuilder.Conventions.Remove<PluralizingTableNameConvention>();
        }
    }

This provides the DbCompiledModel instances. I recommend adding some custom caching code; this is just from a sample:

    class PartDbModelBuilder {
        public static DbCompiledModel Get(string pConnectionString, string pTable) {
            DbModelBuilder builder = new DbModelBuilder();
            builder.Entity<MyPart>().ToTable(pTable, "public");
            using (var connection = new NpgsqlConnection() { ConnectionString = pConnectionString }) {
                var obj = builder.Build(connection).Compile();
                return obj;
            }
        }
    }
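The "custom caching code" mentioned above could look roughly like the following. This is only a sketch under my own assumptions: BoundedModelCache is a hypothetical helper that is not part of Entity Framework or of the original sample, and the least-recently-used eviction policy is my choice. It is generic so that it compiles without an EF reference; in the sample, TModel would be DbCompiledModel and the factory would call PartDbModelBuilder.Get. The point is that the number of models kept alive is bounded and under your control, unlike the internal dictionary.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of Entity Framework or the original sample).
// A small thread-safe cache that keeps at most `capacity` models and evicts
// the least recently used one when it grows beyond that.
public sealed class BoundedModelCache<TModel>
{
    private readonly int _capacity;
    private readonly Func<string, TModel> _factory;
    private readonly object _gate = new object();

    // Maps table name -> node in the usage list, for O(1) lookup and reordering.
    private readonly Dictionary<string, LinkedListNode<KeyValuePair<string, TModel>>> _map =
        new Dictionary<string, LinkedListNode<KeyValuePair<string, TModel>>>();

    // Most recently used entry is first, least recently used is last.
    private readonly LinkedList<KeyValuePair<string, TModel>> _order =
        new LinkedList<KeyValuePair<string, TModel>>();

    public BoundedModelCache(int capacity, Func<string, TModel> factory)
    {
        if (capacity < 1) throw new ArgumentOutOfRangeException("capacity");
        if (factory == null) throw new ArgumentNullException("factory");
        _capacity = capacity;
        _factory = factory;
    }

    public int Count
    {
        get { lock (_gate) { return _map.Count; } }
    }

    public TModel GetOrAdd(string table)
    {
        lock (_gate)
        {
            LinkedListNode<KeyValuePair<string, TModel>> node;
            if (_map.TryGetValue(table, out node))
            {
                // Cache hit: mark the entry as most recently used.
                _order.Remove(node);
                _order.AddFirst(node);
                return node.Value.Value;
            }

            // Cache miss: build the model. Building under the lock keeps the
            // sketch simple, but it serializes model builds across threads.
            TModel model = _factory(table);
            node = new LinkedListNode<KeyValuePair<string, TModel>>(
                new KeyValuePair<string, TModel>(table, model));
            _order.AddFirst(node);
            _map[table] = node;

            if (_map.Count > _capacity)
            {
                // Evict the least recently used entry.
                LinkedListNode<KeyValuePair<string, TModel>> last = _order.Last;
                _order.RemoveLast();
                _map.Remove(last.Value.Key);
            }
            return model;
        }
    }
}
```

In the sample this might be held in a static field, for example a BoundedModelCache<DbCompiledModel> whose factory is table => PartDbModelBuilder.Get(connectionString, table), and the PartContext constructor would call GetOrAdd(pPartition) instead of building a model every time. The right capacity depends on how much memory one compiled model takes in your setup, so measure before picking a number.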

This is the entity I used as a test:

    public class MyPart {
        public int id { get; set; }
        public string name { get; set; }
        public string value { get; set; }
    }

Class I used to run the test:

    using System;
    using System.Linq;
    using System.Threading;
    using System.Threading.Tasks;

    class EFTest {
        public void Run(int tableCount) {
            int done = 0;
            Parallel.For(0, tableCount, new ParallelOptions { MaxDegreeOfParallelism = 5 }, (i) => {
                string id = i.ToString().PadLeft(5, '0');
                using (var context = new PartContext("mypart", "mypart_" + id)) {
                    var objResult = context.Parts.First();
                    Console.WriteLine(objResult.name);
                }
                // done is shared between threads, so increment it atomically
                int current = Interlocked.Increment(ref done);
                Console.WriteLine(current + " DONE");
            });
        }
    }

Table definition:

    CREATE TABLE IF NOT EXISTS mypart (
        id SERIAL,
        name text,
        value text
    ) partition by list (name);

    CREATE TABLE IF NOT EXISTS mypart_00000 PARTITION OF mypart FOR VALUES IN ('mypart00000');
    CREATE TABLE IF NOT EXISTS mypart_00001 PARTITION OF mypart FOR VALUES IN ('mypart00001');
    CREATE TABLE IF NOT EXISTS mypart_00002 PARTITION OF mypart FOR VALUES IN ('mypart00002');
    ...

Postgres 9:

    CREATE TABLE IF NOT EXISTS mypart (
        id SERIAL,
        name text,
        value text
    );

    CREATE TABLE IF NOT EXISTS mypart_00000 (CHECK (name = 'mypart00000')) INHERITS (mypart);
    CREATE TABLE IF NOT EXISTS mypart_00001 (CHECK (name = 'mypart00001')) INHERITS (mypart);
    CREATE TABLE IF NOT EXISTS mypart_00002 (CHECK (name = 'mypart00002')) INHERITS (mypart);
    ...
