
Quickest way to search for objects in a very large list by string C#

For example, I have classes like the ones below:

public class MasterRecord
{
     public int Id { get; set; }
     public string UniqueId{ get; set; }
}

public class DetailRecord
{
     public int Id { get; set; }

     public int MasterRecordId { get; set; }

     public string UniqueId{ get; set; }
}

I also have two lists:

MasterList and DetailList

MasterList will have around 300,000 records, and DetailList will have around 7,000,000 records.

What I need is to loop over every record in MasterList and find the records in DetailList that have the same UniqueId.

Here is my code:

 foreach (var item in MasterList)
 {
    var matchPersons = DetailList.Where(q => q.UniqueId == item.UniqueId).ToList();

    if (matchPersons != null && matchPersons.Count() > 0)
    {
        foreach (var foundPerson in matchPersons)
        {
            //Do something with foundPerson
            foundPerson.MasterRecordId = item.Id;
        }
    }
 }

My code runs very slowly right now: each search takes about 500 milliseconds, so with 300k records it will take 2,500 minutes :( to finish. Is there any way to speed this up? Thanks, and please forgive my poor English.

Updated the code to make it clearer what I want to do.

Using a hash-based structure would be one of the best options:

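// Build the index once: UniqueId -> all DetailRecords with that UniqueId
var detailLookup = DetailList.ToLookup(q => q.UniqueId);
foreach (var item in MasterList)
{
    foreach (var foundPerson in detailLookup[item.UniqueId])
    {
        //Do something with foundPerson
        foundPerson.MasterRecordId = item.Id;
    }
}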

A Lookup returns an empty sequence if the key is not present, so you do not have to test for it.
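To see the lookup approach end to end, here is a minimal, self-contained sketch. It assumes the MasterRecord and DetailRecord classes from the question; the LookupDemo class, the LinkDetails helper, and the sample data are hypothetical names made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MasterRecord
{
    public int Id { get; set; }
    public string UniqueId { get; set; }
}

public class DetailRecord
{
    public int Id { get; set; }
    public int MasterRecordId { get; set; }
    public string UniqueId { get; set; }
}

public static class LookupDemo
{
    // Hypothetical helper: sets MasterRecordId on every detail whose UniqueId
    // matches a master, and returns how many assignments were made.
    public static int LinkDetails(List<MasterRecord> masters, List<DetailRecord> details)
    {
        // One pass over the details builds the hash-based index.
        ILookup<string, DetailRecord> byUniqueId = details.ToLookup(d => d.UniqueId);

        int updated = 0;
        foreach (var master in masters)
        {
            // A missing key yields an empty sequence, so no null/Count check is needed.
            foreach (var detail in byUniqueId[master.UniqueId])
            {
                detail.MasterRecordId = master.Id;
                updated++;
            }
        }
        return updated;
    }

    public static void Main()
    {
        var masters = new List<MasterRecord>
        {
            new MasterRecord { Id = 1, UniqueId = "A" },
            new MasterRecord { Id = 2, UniqueId = "B" },
            new MasterRecord { Id = 3, UniqueId = "Z" }, // no matching details
        };
        var details = new List<DetailRecord>
        {
            new DetailRecord { Id = 10, UniqueId = "A" },
            new DetailRecord { Id = 11, UniqueId = "B" },
            new DetailRecord { Id = 12, UniqueId = "A" },
            new DetailRecord { Id = 13, UniqueId = "C" }, // no matching master
        };

        int updated = LinkDetails(masters, details);
        Console.WriteLine($"Updated {updated} detail records");
        foreach (var d in details)
            Console.WriteLine($"Detail {d.Id} -> MasterRecordId {d.MasterRecordId}");
    }
}
```

The index is built in a single pass over the details, and each master then costs one hash lookup instead of a scan of the whole detail list.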

You can use a Join on UniqueId.

var result = MasterList.Join(DetailList, m => m.UniqueId, d => d.UniqueId, (m, d) => d);
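Since the goal in the question is to copy the master's Id onto each matching detail, the result selector needs to keep both sides of the pair. The following is a rough sketch under that assumption; JoinDemo, LinkDetails, and the sample data are hypothetical names, and the two record classes are repeated from the question so the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MasterRecord
{
    public int Id { get; set; }
    public string UniqueId { get; set; }
}

public class DetailRecord
{
    public int Id { get; set; }
    public int MasterRecordId { get; set; }
    public string UniqueId { get; set; }
}

public static class JoinDemo
{
    // Hypothetical helper: pairs each detail with its master via Join
    // and copies the master's Id onto the detail.
    public static void LinkDetails(IEnumerable<MasterRecord> masters, IEnumerable<DetailRecord> details)
    {
        var pairs = masters.Join(details,
            m => m.UniqueId,                            // key taken from the master
            d => d.UniqueId,                            // key taken from the detail
            (m, d) => new { Master = m, Detail = d });  // keep both sides of the match

        // The query is deferred; it runs when this loop enumerates it.
        foreach (var pair in pairs)
            pair.Detail.MasterRecordId = pair.Master.Id;
    }

    public static void Main()
    {
        var masters = new List<MasterRecord>
        {
            new MasterRecord { Id = 1, UniqueId = "A" },
            new MasterRecord { Id = 2, UniqueId = "B" },
        };
        var details = new List<DetailRecord>
        {
            new DetailRecord { Id = 10, UniqueId = "A" },
            new DetailRecord { Id = 11, UniqueId = "B" },
            new DetailRecord { Id = 12, UniqueId = "C" }, // no matching master, stays 0
        };

        LinkDetails(masters, details);
        foreach (var d in details)
            Console.WriteLine($"Detail {d.Id} -> MasterRecordId {d.MasterRecordId}");
    }
}
```

Details without a matching master simply do not appear in the join result, so their MasterRecordId keeps its default value.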

If you need to handle "MasterRecords with their DetailRecords", don't use a normal join; use a GroupJoin. Internally this creates something similar to a lookup table.

The nice thing is that this also works with databases, CSV files, or whatever method you use to get your records. You don't have to convert them into lists first.

// Your input sequences, if desired: use IQueryable
IEnumerable<MasterRecord> masterRecords = ...
IEnumerable<DetailRecord> detailRecords = ...
// Note: query not executed yet!

// GroupJoin these two sequences
var masterRecordsWithTheirDetailRecords = masterRecords.GroupJoin(detailRecords,
    masterRecord => masterRecord.Id,              // from masterRecord take the primary key
    detailRecord => detailRecord.MasterRecordId,  // from detailRecord take the foreign key
    // (if MasterRecordId has not been filled in yet, use UniqueId as the key on both sides)

    // ResultSelector: from every MasterRecord with its matching DetailRecords select
    (masterRecord, matchingDetails) => new
    {
        // select the properties you plan to use:
        Id = masterRecord.Id,
        UniqueId = masterRecord.UniqueId,
        ...

        DetailRecords = matchingDetails.Select(detailRecord => new
        {
            // again: select only the properties you plan to use
            Id = detailRecord.Id,
            ...

            // not needed, you know the value:
            // MasterRecordId = detailRecord.MasterRecordId,
        }),
        // Note: this is still an IEnumerable!
    });

Usage:

foreach(var masterRecord in masterRecordsWithTheirDetailRecords)
{
    ... // process the master record with its detail records
}
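For a version that compiles as-is, here is a small sketch of the same GroupJoin idea. It assumes UniqueId as the key on both sides (in the question, MasterRecordId is not populated yet); GroupJoinDemo, GroupDetails, and the sample data are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MasterRecord
{
    public int Id { get; set; }
    public string UniqueId { get; set; }
}

public class DetailRecord
{
    public int Id { get; set; }
    public int MasterRecordId { get; set; }
    public string UniqueId { get; set; }
}

public static class GroupJoinDemo
{
    // Hypothetical helper: for every master, collect the Ids of the details
    // that share its UniqueId. Masters without details get an empty list.
    public static List<(int MasterId, List<int> DetailIds)> GroupDetails(
        IEnumerable<MasterRecord> masters, IEnumerable<DetailRecord> details)
    {
        return masters.GroupJoin(details,
            m => m.UniqueId,   // assumption: match on UniqueId
            d => d.UniqueId,
            (m, matching) => (MasterId: m.Id, DetailIds: matching.Select(d => d.Id).ToList()))
            .ToList();
    }

    public static void Main()
    {
        var masters = new List<MasterRecord>
        {
            new MasterRecord { Id = 1, UniqueId = "A" },
            new MasterRecord { Id = 2, UniqueId = "B" },
            new MasterRecord { Id = 3, UniqueId = "Z" }, // no details
        };
        var details = new List<DetailRecord>
        {
            new DetailRecord { Id = 10, UniqueId = "A" },
            new DetailRecord { Id = 11, UniqueId = "B" },
            new DetailRecord { Id = 12, UniqueId = "A" },
        };

        foreach (var entry in GroupDetails(masters, details))
            Console.WriteLine($"Master {entry.MasterId}: [{string.Join(", ", entry.DetailIds)}]");
    }
}
```

Unlike a plain Join, every master appears exactly once in the result, including those with no matching details.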

The nice thing is that the query is evaluated lazily: if you only need to process some of the MasterRecords (for instance, after the 1000th you decide that you have found what you were searching for), or if there are MasterRecords for which you don't need all DetailRecords, no more results are produced than necessary. LINQ takes care of that.
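The deferred behaviour can be observed by counting how often the result selector runs. This is only a sketch with plain string keys standing in for the records; LazyDemo and CountSelectorCalls are hypothetical names. Note that with in-memory sequences (LINQ to Objects) the detail sequence is still read in full once to build the internal lookup; what is deferred is the per-master result.

```csharp
using System;
using System.Linq;

public static class LazyDemo
{
    // Hypothetical helper: returns how many times the result selector had run
    // before enumeration started, and after stopping at the first result.
    public static (int Before, int After) CountSelectorCalls()
    {
        var masterKeys = new[] { "A", "B", "C" };
        var detailKeys = new[] { "A", "A", "B", "C" };
        int selectorCalls = 0;

        var query = masterKeys.GroupJoin(detailKeys,
            m => m,
            d => d,
            (m, matches) =>
            {
                selectorCalls++; // runs once per master that is actually requested
                return new { Key = m, Count = matches.Count() };
            });

        int before = selectorCalls; // nothing has been enumerated yet

        foreach (var result in query)
        {
            break; // stop after the first master
        }

        return (before, selectorCalls);
    }

    public static void Main()
    {
        var counts = CountSelectorCalls();
        Console.WriteLine($"Selector calls before enumeration: {counts.Before}");
        Console.WriteLine($"Selector calls after taking one result: {counts.After}");
    }
}
```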

