
Comparing a record from a CSV file with a large list

On my website, a user uploads a CSV file.

I am reading the CSV file using this library: http://www.codeproject.com/Articles/11698/A-Portable-and-Efficient-Generic-Parser-for-Flat-F. The CSV file will have around 4,000 records (each record with 5 columns).

I read each record into a List and search a large list of objects (before reading the CSV file, I load that large list from a service into the cache) to check whether the record already exists.

This means 4,000 iterations, and in each iteration I have to search the large list of objects (around 100,000 records held in the cache).

Is this a good way to implement it? Is there any way to improve the speed? Is it a good idea to store such a large list in the cache?

My environment is VS2010 and .NET 4.0.

You can speed up your search by using an appropriate data structure for your list. If the items have a unique/primary key, you could use a hash map (in .NET, a Dictionary<TKey, TValue> or a HashSet<T>), which is much more efficient than iterating the whole list for each item: you can simply call ContainsKey().
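For illustration, a minimal sketch of that idea in C#. The Record type, its Key property, and the use of string keys are assumptions for the example, not taken from the question:

    // Sketch only: build a HashSet of keys once, then test each CSV row
    // against it. "Record" and "Key" are assumed names, not the poster's code.
    using System.Collections.Generic;
    using System.Linq;

    public class Record
    {
        public string Key { get; set; }   // the unique/primary key column
        // ... the other four columns
    }

    public class DuplicateChecker
    {
        private readonly HashSet<string> existingKeys;

        // One O(n) pass over the ~100,000 cached objects.
        public DuplicateChecker(IEnumerable<Record> cachedObjects)
        {
            existingKeys = new HashSet<string>(cachedObjects.Select(r => r.Key));
        }

        // Each CSV row is then one O(1) average-case lookup.
        public bool Exists(string key)
        {
            return existingKeys.Contains(key);
        }
    }

Building the set is a one-time cost; after that, the total work drops from roughly 4,000 x 100,000 list comparisons to about 4,000 constant-time lookups.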

If you run the service yourself, you could also push the responsibility up to the service, perhaps by sending the list of unique keys there for comparison.
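A hypothetical contract for that approach (the interface and method names below are invented for illustration):

    using System.Collections.Generic;

    // Hypothetical service contract: the client sends only the keys parsed
    // from the CSV (~4,000 strings) and the service answers which ones
    // already exist, so the 100,000-object list never has to be cached on
    // the web server at all.
    public interface IRecordService
    {
        ICollection<string> FindExistingKeys(ICollection<string> candidateKeys);
    }

This trades a larger request payload for not having to hold (and refresh) a very large list in the web application's cache.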

Maybe you could post some code for a more specific answer.
