C# Zip function to iterate through two lists of objects

I have a program that creates a list of objects from a file, and also creates a list of the same type of object from the database, but with fewer (and some different) properties. For example:

List from file: Address ID, Address, City, State, Zip, other important properties
List from DB: Address ID, Address, City, State

I have implemented IEquatable on this CustObj so that it only compares Address, City, and State, in the hope of doing easy comparisons between the two lists.

The ultimate goal is to get the address ID from the database and update the address ID of each address in the list of objects from the file. These two lists could hold quite a lot of objects (over 1,000,000), so I want it to be fast.

The alternative is to offload this to the database and have the DB return the info we need. If that would be significantly faster or more resource efficient, I will go that route, but I want to see if it can be done quickly and efficiently in code first.

Anyway, I see there's a Zip method. I was wondering if I could use that to say "if there's a match between the two lists, keep the data in list 1 but update the address ID property of each object in list 1 to the address ID from list 2".

Is that possible?

The answer is: it really depends. There are a lot of parameters you haven't mentioned.

The only way to be sure is to build one solution (preferably the in-code one using the Zip method, since it involves less work). If it works within your requirements (time, memory footprint, or any other constraint), you can stop there.

Otherwise you have to offload it to the database. Mind you, if you want to use the Zip method, you would have to hold the 1 million records from the file and the 1 million records from the DB in memory at the same time.

The problem with pushing everything to the database is that inserting that many records consumes resources (time, space, etc.). Moreover, if you want to do that regularly, maybe every day, it is going to be even more demanding resource-wise.

Your question didn't say whether this is going to be a one-time thing or a daily event in a production environment. That alone is going to make a difference in which approach to choose.

To repeat, you would have to try different approaches to see which works best for you based on your requirements: Is this a one-time thing? How many resources does the process have? How much time does it have? And possibly many more.
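One thing worth keeping in mind when trying the in-code route: `Enumerable.Zip` pairs elements by position (first with first, second with second), so on its own it only fits if both lists are already in the same order. For matching on Address, City, and State, a dictionary lookup is one in-memory option. Below is a minimal sketch, assuming a hypothetical `CustObj` with these property names and an exact, case-sensitive match; the `AddressMatcher` helper is made up for illustration and the real class and matching rules may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shape of the object from the question; property names are assumptions.
public class CustObj
{
    public int AddressId { get; set; }
    public string Address { get; set; } = "";
    public string City { get; set; } = "";
    public string State { get; set; } = "";
}

public static class AddressMatcher
{
    // Copies AddressId from dbList onto every fileList item that has the same
    // Address/City/State. Returns the number of items that were updated.
    public static int UpdateAddressIds(List<CustObj> fileList, List<CustObj> dbList)
    {
        // Value tuples compare structurally, so they can serve as dictionary keys.
        var idByAddress = new Dictionary<(string, string, string), int>(dbList.Count);
        foreach (var db in dbList)
        {
            // Assumption: if the DB list has duplicate addresses, the last one wins.
            idByAddress[(db.Address, db.City, db.State)] = db.AddressId;
        }

        int updated = 0;
        foreach (var item in fileList)
        {
            if (idByAddress.TryGetValue((item.Address, item.City, item.State), out int id))
            {
                item.AddressId = id;
                updated++;
            }
        }
        return updated;
    }
}
```

This makes one pass over each list, so the work grows roughly linearly with the total number of records, but it still needs both lists plus the dictionary in memory, as noted above.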

It also kind of sounds like a job for .Aggregate(), something like:

var aggreg = list1.Aggregate(otherListPrefilled, (acc, elemFrom1) =>
{
    // Some code to create the joined data, using elemFrom1 to find
    // and modify the correct element in acc (otherListPrefilled).
    return acc;
});

Normally I would use an empty otherListPrefilled; I'm not sure how it performs on 100k data items, though.
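To make the idea a bit more concrete, here is one possible sketch of the empty-accumulator variant. The `AddressRow` record, the `AggregateJoin` helper, and the dictionary lookup are all assumptions added for illustration, not something from the question; it also assumes C# 9 or later for records.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical immutable row type standing in for the real objects.
public record AddressRow(int Id, string Address, string City, string State);

public static class AggregateJoin
{
    public static List<AddressRow> Join(
        IEnumerable<AddressRow> fileRows, IEnumerable<AddressRow> dbRows)
    {
        // ToDictionary throws on duplicate keys, so this assumes each
        // Address/City/State combination appears only once in dbRows.
        var idByKey = dbRows.ToDictionary(r => (r.Address, r.City, r.State), r => r.Id);

        // Start from an empty accumulator and fold the file rows into it.
        return fileRows.Aggregate(new List<AddressRow>(), (acc, row) =>
        {
            var key = (row.Address, row.City, row.State);
            // Keep the file row, swapping in the database ID when there is a match.
            acc.Add(idByKey.TryGetValue(key, out var dbId) ? row with { Id = dbId } : row);
            return acc;
        });
    }
}
```

Written this way, Aggregate does the same work as a plain foreach loop, so it is mostly a matter of style and should not be expected to be faster by itself.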

If it's a one-time thing, it's probably faster to export your file to CSV, import it into your database as a temporary table, and join the data in SQL.

