Faster way to search string in a large csv file C#
I am having
Now I need to match AccId and find its corresponding External_ID in the csv file.
Currently I am doing it with the code below:
DataTable tblATL = Util.GetTable("ATL", false);
tblATL.Columns.Add("External_ID");
DataTable tbl = Util.CsvToTable("TT.csv", true);
foreach (DataRow columnRow in tblATL.Rows)
{
    var query = tbl.Rows.Cast<DataRow>().FirstOrDefault(x => x.Field<string>("AccId") == columnRow["AccId"].ToString());
    if (query != null)
    {
        columnRow["External_ID"] = query.Field<string>("External_ID");
    }
    else
    {
        columnRow["External_ID"] = "New";
    }
}
This code works, but it has a performance problem: it takes a very long time to produce the result.
Please help. How can I improve its performance? Is there another approach?
I suggest organizing the data into a dictionary, say, Dictionary<String, String[]>, which offers lookup by key in O(1) time on average, e.g.
Dictionary<String, String[]> Externals = File
    .ReadLines(@"C:\MyFile.csv")
    .Select(line => line.Split(',')) // the simplest, just to show the idea
    .ToDictionary(
        items => items[0], // let External_ID be the 1st column
        items => items     // or whatever record representation
    );
....
String externalId = ...
String[] items = Externals[externalId];
EDIT: if the same External_ID can appear more than once (see comments below), you have to deal with duplicates, e.g.
var csv = File
    .ReadLines(@"C:\MyFile.csv")
    .Select(line => line.Split(',')); // the simplest, just to show the idea
Dictionary<String, String[]> Externals = new Dictionary<String, String[]>();
foreach (var items in csv) {
    var key = items[0]; // let External_ID be the 1st column
    var value = items;  // or whatever record representation
    // keep only the first record seen for each key
    if (!Externals.ContainsKey(key))
        Externals.Add(key, value);
    // else {
    //   //TODO: implement, if you want to deal with duplicates in some other way
    // }
}
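To tie this back to the loop in the question, here is a minimal sketch of the same idea applied to the two DataTables, keyed by AccId since that is the column being matched. The class and method names (ExternalIdLookup, BuildIndex, FillExternalIds) are hypothetical and not part of the original code, and the sketch assumes both tables already hold "AccId" and "External_ID" columns (for example, as loaded by the asker's own Util helpers, whose behavior is not shown here). It keeps the first match per AccId, as FirstOrDefault does in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper: names are illustrative only.
public static class ExternalIdLookup
{
    // One pass over the CSV table: AccId -> External_ID.
    public static Dictionary<string, string> BuildIndex(DataTable csvTable)
    {
        var index = new Dictionary<string, string>(StringComparer.Ordinal);
        foreach (DataRow row in csvTable.Rows)
        {
            // Rows without an AccId can never match, so skip them.
            if (row.IsNull("AccId"))
                continue;

            string accId = Convert.ToString(row["AccId"]);

            // Keep the first occurrence only, like FirstOrDefault in the question.
            if (!index.ContainsKey(accId))
                index.Add(accId, Convert.ToString(row["External_ID"]));
        }
        return index;
    }

    // One pass over the target table: a dictionary lookup per row
    // replaces the scan over every CSV row.
    public static void FillExternalIds(DataTable target, Dictionary<string, string> index)
    {
        foreach (DataRow row in target.Rows)
        {
            string accId = Convert.ToString(row["AccId"]);
            row["External_ID"] = index.TryGetValue(accId, out string externalId)
                ? externalId
                : "New"; // same fallback value as in the question
        }
    }
}
```

With the tables from the question, usage would look something like ExternalIdLookup.FillExternalIds(tblATL, ExternalIdLookup.BuildIndex(tbl)). The index is built once, so the total work grows with the sum of the two row counts rather than their product.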