Improving the performance of List<T>

Question

I have a class called Product that contains some properties such as Id (a Guid) and Messages (a List), and a Message class that also contains an Id and other properties. I have all messages in a Message table and all products in a Product table. After getting the data from both tables, I want to join them on the Id property. If I use the code below, performance is terrible because it does a linear search over the messages for every product.

foreach (Product product in products)
    product.Messages = messages.Where(n => n.Id == product.Id).ToList();

Are there any other ways to do it faster?

Thanks

Answer 1

You might be able to speed it up by grouping your messages into a lookup table.

var messagesDict = messages
    .GroupBy(x => x.Id)
    .ToDictionary(g => g.Key, g => g.ToList());

Or, as John Bustos suggested, you can use ToLookup():

var messagesDict = messages
    .ToLookup(x => x.Id);

You use it like this:

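//With the Dictionary version, you might have to first check (e.g. with
//TryGetValue) that messagesDict actually has any messages for your product.
//With the ToLookup version, append .ToList(), since Messages is a List.
product.Messages = messagesDict[product.Id];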
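To make the ToLookup variant concrete, here is a minimal sketch. The Product and Message classes are hypothetical reconstructions from the question (the Text property and the LookupJoin helper are assumptions for illustration), and the join key follows the question's own predicate, n.Id == product.Id.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal versions of the classes described in the question.
public class Message
{
    public Guid Id { get; set; }
    public string Text { get; set; }   // assumed stand-in for "other properties"
}

public class Product
{
    public Guid Id { get; set; }
    public List<Message> Messages { get; set; }
}

public static class LookupJoin
{
    // Groups the messages once, then hands each product its group.
    public static void AttachMessages(List<Product> products, List<Message> messages)
    {
        ILookup<Guid, Message> byId = messages.ToLookup(m => m.Id);

        foreach (Product product in products)
        {
            // Indexing a lookup with a missing key yields an empty sequence,
            // so products without messages simply get an empty list.
            product.Messages = byId[product.Id].ToList();
        }
    }
}
```

Calling LookupJoin.AttachMessages(products, messages) replaces the original foreach loop. Unlike the Dictionary version, no existence check is needed before indexing.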

Your original attempt is O(nm), where n is the number of products and m is the number of messages.

A Dictionary uses hashing, so from a practical standpoint you can usually assume that it has close to O(1) inserts and O(1) searches. Under ideal circumstances, List<T>.Add is also O(1). This means that if you were to create your lookup dictionary manually, you could do it in O(m). I would hope that a built-in function like ToLookup achieves the same efficiency.
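As a rough illustration of building the lookup dictionary by hand in one pass, something like the following should work. ManualLookup and Build are hypothetical names, and the helper is generic so that it does not depend on the exact shape of Message.

```csharp
using System;
using System.Collections.Generic;

public static class ManualLookup
{
    // Builds key -> list of items in a single pass over the input.
    public static Dictionary<TKey, List<T>> Build<T, TKey>(
        IEnumerable<T> items, Func<T, TKey> keyOf)
    {
        var groups = new Dictionary<TKey, List<T>>();
        foreach (T item in items)
        {
            TKey key = keyOf(item);
            List<T> bucket;
            // One hash lookup finds the bucket; create it the first time a key is seen.
            if (!groups.TryGetValue(key, out bucket))
            {
                bucket = new List<T>();
                groups.Add(key, bucket);
            }
            bucket.Add(item);   // amortized O(1) append
        }
        return groups;
    }
}
```

With the question's types, this would be called as ManualLookup.Build(messages, m => m.Id). Each message costs one hash lookup and one append, which is where the O(m) estimate comes from.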

Once you do that, your algorithm becomes O(n + m): O(m) to build the lookup plus O(n) to assign each product its messages.

Answer 2

You should be doing the join in the database. That'll yield the best performance. If you insist on doing this in C#, sort the products by Id and the messages by Id first, so that the two sorted lists can be matched in a single pass.
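The answer does not spell out what to do after sorting. One plausible reading is a sort-merge join, sketched below under that assumption. SortMergeJoin and Join are hypothetical names, and the sketch assumes the outer keys (the product Ids) are unique.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SortMergeJoin
{
    // Pairs each outer item with the inner items that share its key.
    // Assumes outer keys are unique, as product Ids would be.
    public static List<KeyValuePair<TOuter, List<TInner>>> Join<TOuter, TInner, TKey>(
        IEnumerable<TOuter> outer, IEnumerable<TInner> inner,
        Func<TOuter, TKey> outerKey, Func<TInner, TKey> innerKey)
        where TKey : IComparable<TKey>
    {
        List<TOuter> sortedOuter = outer.OrderBy(outerKey).ToList();
        List<TInner> sortedInner = inner.OrderBy(innerKey).ToList();
        var result = new List<KeyValuePair<TOuter, List<TInner>>>();
        int i = 0;   // cursor into sortedInner; it only ever moves forward

        foreach (TOuter o in sortedOuter)
        {
            TKey key = outerKey(o);

            // Skip inner items whose key sorts before the current outer key.
            while (i < sortedInner.Count && innerKey(sortedInner[i]).CompareTo(key) < 0)
                i++;

            // Collect the run of inner items whose key equals the outer key.
            var matches = new List<TInner>();
            while (i < sortedInner.Count && innerKey(sortedInner[i]).CompareTo(key) == 0)
            {
                matches.Add(sortedInner[i]);
                i++;
            }
            result.Add(new KeyValuePair<TOuter, List<TInner>>(o, matches));
        }
        return result;
    }
}
```

With the question's types, each pair returned by SortMergeJoin.Join(products, messages, p => p.Id, m => m.Id) gives a product as Key and its messages as Value, so pair.Key.Messages = pair.Value completes the join. The two sorts dominate the cost at O(n log n + m log m), while the merge itself is linear, so the hash-based lookup from Answer 1 is usually the simpler choice in memory.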

Answer 3

As others have indicated, do the join in the database. As you indicated, a linear search is O(n) (and you're actually doing quite a few linear searches in this case); however, most databases use a B-Tree data structure (or something similar) to keep rows ordered by primary key. This means that a database search on a primary key is O(log n), which is dramatically faster (assuming, of course, that the Id is the primary key).

Answer 1
2, accepted, 2016-09-01 18:39:44

Answer 2
0, 2016-09-01 18:40:03

Answer 3
0, 2016-09-01 18:43:42
