Improve performance of List<T>

I have a class called Product that contains some properties, such as Id (a Guid) and Messages (a List), and a Message class that also contains an Id and other properties. All messages are stored in a Message table and all products in a Product table. After loading the data from both tables, I want to join them on the Id property. The code below does a linear search over all messages for every product, so the performance is terrible.

foreach (Product product in products)
    product.Messages = messages.Where(n => n.Id == product.Id).ToList();

Are there any other ways to do it faster?

Thanks

You might be able to speed it up by grouping your messages into a lookup table.

var messagesDict = messages
    .GroupBy(x => x.Id)
    // Each group exposes its key as Key; both arguments must be lambdas.
    .ToDictionary(g => g.Key, g => g.ToList());

Or, as John Bustos suggested, you can use ToLookup():

var messagesDict = messages
    .ToLookup(x => x.Id);

You use it like this:

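// With the Dictionary version, first check that messagesDict actually has
// an entry for your product (e.g. with TryGetValue); the indexer throws otherwise.
// With the ToLookup version, a missing key yields an empty sequence,
// and you need to append .ToList() to assign it to a List.
product.Messages = messagesDict[product.Id];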

Your original attempt is O(nm), where n is the number of products and m is the number of messages.

A Dictionary uses hashing, so from a practical standpoint you can usually assume that it has close to O(1) inserts and O(1) searches. Under ideal circumstances, List<T>.Add is also O(1). This means that if you were to build your lookup dictionary manually, you could do it in O(m). I would hope that a built-in function like ToLookup achieves the same efficiency.
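To make the O(m) claim concrete, here is a minimal sketch of building the lookup dictionary by hand. The Message class is a stripped-down stand-in for the one in the question, and MessageIndex and Build are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the question's Message class; only Id matters here.
public class Message
{
    public Guid Id { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class MessageIndex
{
    // Groups messages by Id in a single pass: expected O(m) overall.
    public static Dictionary<Guid, List<Message>> Build(IEnumerable<Message> messages)
    {
        var index = new Dictionary<Guid, List<Message>>();
        foreach (var message in messages)
        {
            // Hash lookup, expected O(1); create the bucket the first time an Id is seen.
            if (!index.TryGetValue(message.Id, out var bucket))
            {
                bucket = new List<Message>();
                index[message.Id] = bucket;
            }
            bucket.Add(message); // amortized O(1)
        }
        return index;
    }
}
```

Calling MessageIndex.Build(messages) once and then using TryGetValue per product avoids rescanning the message list for every product.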

Once you do that, your algorithm becomes O(n + m).
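Put together, the whole join might look like the sketch below. It assumes Product and Message have the shape described in the question (reduced here to the members that matter); ProductJoiner and AttachMessages are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-ins for the question's classes, reduced to the relevant members.
public class Message
{
    public Guid Id { get; set; } // matches Product.Id, as in the question
}

public class Product
{
    public Guid Id { get; set; }
    public List<Message> Messages { get; set; } = new List<Message>();
}

// Hypothetical helper, not part of the original answer.
public static class ProductJoiner
{
    public static void AttachMessages(List<Product> products, List<Message> messages)
    {
        // One pass over the messages to build the lookup.
        ILookup<Guid, Message> byId = messages.ToLookup(m => m.Id);

        // One pass over the products, each with a hashed lookup.
        foreach (Product product in products)
        {
            // A lookup returns an empty sequence for a missing key,
            // so no existence check is needed before indexing.
            product.Messages = byId[product.Id].ToList();
        }
    }
}
```

Products without any messages end up with an empty list rather than null, which keeps later code free of null checks.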

You should be doing the join in the database. That'll yield the best performance. If you insist on doing this in C#, sort the products by Id and the messages by Id first, so that the two lists can then be matched in a single pass.
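The answer does not spell out the sorted approach; one way to read it is a sort-merge join, sketched below. This assumes product Ids are unique and uses stand-in classes and the hypothetical names SortMergeJoiner and AttachMessages.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-ins for the question's classes, reduced to the relevant members.
public class Message
{
    public Guid Id { get; set; }
}

public class Product
{
    public Guid Id { get; set; }
    public List<Message> Messages { get; set; } = new List<Message>();
}

// Hypothetical helper; assumes every product has a distinct Id.
public static class SortMergeJoiner
{
    public static void AttachMessages(List<Product> products, List<Message> messages)
    {
        // Guid implements IComparable<Guid>, so the default ordering works.
        var sortedProducts = products.OrderBy(p => p.Id).ToList();
        var sortedMessages = messages.OrderBy(m => m.Id).ToList();

        int i = 0; // current position in sortedMessages
        foreach (var product in sortedProducts)
        {
            // Skip messages whose Id sorts before this product's Id.
            while (i < sortedMessages.Count && sortedMessages[i].Id.CompareTo(product.Id) < 0)
                i++;

            // Collect the run of messages that share this product's Id.
            var matches = new List<Message>();
            while (i < sortedMessages.Count && sortedMessages[i].Id.CompareTo(product.Id) == 0)
            {
                matches.Add(sortedMessages[i]);
                i++;
            }
            product.Messages = matches;
        }
    }
}
```

The merge itself is a single pass, but the two sorts cost O(n log n + m log m), so this is asymptotically slower than a hash-based lookup; it mainly pays off when the data already arrives sorted, for example from an ORDER BY in the query.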

As others have indicated, do the join in the database. As you indicated, a linear search is O(n) (and you're actually doing quite a few linear searches in this case); however, most databases use a B-tree data structure (or something similar) to keep rows sorted by primary key. This means that a database search on a primary key is O(log n), which is dramatically faster (assuming, of course, that Id is the primary key).
