简体   繁体   中英

How can I speed up instantiating a large collection of objects?

The code below is extremely slow for tables of any significant size. (100, 1000, etc...) The culprit is instantiating my objects with new T() . Note that this isn't my finalized code, I've just broken parts of it out in order to more easily profile. Instantiation and initialization will happen together once I refactor the code back into shape.

Is there any way to speed this up? I'm probably forgetting something really simple, or maybe I'm boned. Hopefully, the former.

public static IList<T> ToList<T>(this DataTable table) where T : Model, new()
{
    T[] entities = new T[table.Rows.Count];

    // THIS LOOP IS VERY VERY SLOW
    for (int i = 0; i < table.Rows.Count; i++)
        entities[i] = new T();

    // THIS LOOP IS FAST
    for (int i = 0; i < table.Rows.Count; i++)
        entities[i].Init(table, table.Rows[i]);

    return new List<T>(entities);
}

edit for more info:

The constructor of any given ModelType will look like this:

public ModelType()
{
    _modelInfo = new ModelTypeInfo();
}

The constructor of any given ModelTypeInfo will simply set some string and string[] values, and that class' only job is to provide the values set.

edit for even more info:

Since it seems to be a hot topic, here is what my method looks like for reals before breaking out object construction and initialization:

public static IList<T> ToList<T>(this DataTable table, ModelInfo modelInfo) where T : Model, new()
{
    var tempRepository = new Repository<T>(modelInfo);

    var list = new List<T>();
    foreach (DataRow row in table.Rows)
        list.Add(tempRepository.FromData(table, row));

    return list;
}

Under the covers, new T() generates a call to System.Activator.CreateInstance<T>() , which is (reflectively) slow:

L_0012: ldc.i4.0 
L_0013: stloc.1 
L_0014: br.s L_0026
L_0016: ldloc.0 
L_0017: ldloc.1 
L_0018: call !!0 [mscorlib]System.Activator::CreateInstance<!!T>()
L_001d: stelem.any !!T
L_0022: ldloc.1 
L_0023: ldc.i4.1 
L_0024: add 
L_0025: stloc.1 

You may wish to consider passing in a construction delegate instead.

The title of your question suggests that this has to do with the fact that the method is generic. Is allocating the same number of objects without generics faster? If not, it must be to do with whatever work's going on in your constructor. Can you post the constructor code?

EDITED Here is something I wrote awhile ago to cache constructors in a DynamicMethod, which is very fast:

In your class:

delegate T ConstructorDelegate();

The method body:

DynamicMethod method = new DynamicMethod(string.Empty, typeof(T), null,
    MethodBase.GetCurrentMethod().DeclaringType.Module);
ILGenerator il = method.GetILGenerator();
il.Emit(OpCodes.Newobj, type.GetConstructor(Type.EmptyTypes));
il.Emit(OpCodes.Ret);
var constructor = (ConstructorDelegate)method.CreateDelegate(typeof(ConstructorDelegate));

The problem is that the expression new T() actually uses reflection behind the scenes. (It calls Activator.CreateInstance ) Therefore, each call to it will take time.


One solution would be to constrain T to implement ICloneable. Then, you can write new T() once, and clone it in the loop. Obviously, you can only do that if you have full control over the model.


Another option would be to make the method take a creator delegate, like this:

public static IList<T> ToList<T>(this DataTable table, Func<T> creator) where T : Model {
    T[] entities = new T[table.Rows.Count];
    for (int i = 0; i < table.Rows.Count; i++)
        entities[i] = creator();

    //...
}

You would then call it like this:

table.ToList(() => new MyModelType());

Because it's used in a parameter, you wouldn't need to specify the generic type explicitly when calling the method.


The least intrusive method would be to use LINQ expressions to make your own creator methods.

EDIT : Like this:

static class CreatorFactory<T> where T : new() {
    public static readonly Func<T> Method = 
        Expression.Lambda<Func<T>>(Expression.New(typeof(T)).Compile();
}

public static IList<T> ToList<T>(this DataTable table) where T : Model {
    var entities = table.Rows.Select(r => CreatorFactory<T>.Method()).ToList();

    for (int i = 0; i < table.Rows.Count; i++)
        entities[i].Init(table, table.Rows[i]);

    return entities;
}

Do you really need a list, or would an IEnumerable be good enough? If so, you could do lazy/deferred creation of your objects:

public static IEnumerable<T> ToEnumerable<T>(this DataTable table) where T : Model, new()
{
    foreach (DataRow row in table.Rows)
    {
        T entity = new T();
        entity.Init(table, row);

        yield return entity;
    }
}

Unfortunately this is still likely to be slow, because most of the time is likely spent construction the object, but it might allow you to defer this load long enough to make the app appear faster, or until after you are able to filter some of the objects out entirely.

Also, you might think of implementing this using a Factory -like pattern:

public static IEnumerable<T> ToEnumerable<T>(this DataTable table, Func<DataRow, T> TFactory) 
{
    foreach (DataRow row in table.Rows)
    {
        yield return TFactory(row);
    }
}

Are you testing a release build?
Is tables.loop.count a simple property, and can you compare to hoisting it out of the loop?
What's the cost of instantiating T?
Does creating T allocate lots of small objects, so that you run into a few garbage collections?

To show by example, this method in C#:

public T Method<T>() where T : new()
{
    return new T();
}

is compiled to this MSIL code (from Reflector):

.method public hidebysig instance !!T Method<.ctor T>() cil managed
{
.maxstack 2
.locals init (
    [0] !!T CS$1$0000,
    [1] !!T CS$0$0001)
L_0000: nop 
L_0001: ldloca.s CS$0$0001
L_0003: initobj !!T
L_0009: ldloc.1 
L_000a: box !!T
L_000f: brfalse.s L_001c
L_0011: ldloca.s CS$0$0001
L_0013: initobj !!T
L_0019: ldloc.1 
L_001a: br.s L_0021
L_001c: call !!0 [mscorlib]System.Activator::CreateInstance<!!T>()
L_0021: stloc.0 
L_0022: br.s L_0024
L_0024: ldloc.0 
L_0025: ret 
}

To not go into too much internals, there are several steps here, several conditions being checked, the need to initialize data fields, etc, and finally a call to the Activator may be required. All this to instantiate a generic type object. And yes, this is used instead of a direct call to the type's constructor always.

Even if you need to use a list, why create the array first?

public static IList<T> ToList<T>(this DataTable table) where T : Model, new()
{
    var list = new List<T>();
    foreach (DataRow dr in table.Rows) {
        T entity = new T();
        entity.Init(table, dr);
        list.Add(entity);
    }
    return list;
}

For anyone running into this issue later on, this blog post was extremely helpful for me: http://blogs.msdn.com/haibo_luo/archive/2005/11/17/494009.aspx

Here are the changes I ended up making to my "factory" method. (not really a proper factory, but serves the purpose)

public class Repository<T> : IRepository<T> where T : Model, new()
{
    // ...

    private delegate T CtorDelegate();
    private CtorDelegate _constructor = null;
    private CtorDelegate Constructor
    {
        get
        {
            if (_constructor == null)
            {
                Type type = typeof(T);
                DynamicMethod dm = new DynamicMethod(type.Name + "Constructor", type, new Type[] { }, typeof(Repository<T>).Module);
                ILGenerator ilgen = dm.GetILGenerator();
                ilgen.Emit(OpCodes.Nop);
                ilgen.Emit(OpCodes.Newobj, type.GetConstructor(Type.EmptyTypes));
                ilgen.Emit(OpCodes.Ret);
                _constructor = (CtorDelegate)dm.CreateDelegate(typeof(CtorDelegate));
            }
            return _constructor;
        }
    }

    public T FromData(DataTable table, DataRow row)
    {
        T model = Constructor(); // was previously = new T();
        model.Init(table, row);
        return model;
    }
}

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM