How to extract the header from a CSV file in C#
I'm loading a couple of CSV files in C# and splitting each one into two lists. Now I also need to extract the header from the first line, with ; as the delimiter. I'm trying to use .Skip(1), but that (obviously) only skips the line. I need to extract the header and, once my work with the rest of the data is done, add it back as the first line.
Here is what I have tried so far:
string[] fileNames = Directory.GetFiles(@"read\", "*.csv");
for (int i = 0; i < fileNames.Length; i++)
{
    string file = @"read\" + Path.GetFileName(fileNames[i]);
    var lines = File.ReadLines(file).Skip(1);
    (List<string> dataA, List<string> dataB) = SplitAllTodataAAnddataB(lines);
    var rowLog = 0;
    foreach (var line in dataA)
    {
        // Variables for the lines
        string[] entries = line.Split(';');
        rowLog++;
        Helper.checkdataAString(entries[0].ToLower(), "abc", rowLog);
        Helper.checkdataAString(entries[1].ToLower(), "firstname", rowLog);
        Helper.checkdataAString(entries[2].ToLower(), "lastname", rowLog);
        Helper.checkdataAString(entries[4].ToLower(), "gender", rowLog);
        Helper.checkdataAString(entries[5].ToLower(), "id", rowLog);
        Helper.checkdataAString(entries[3], "date", rowLog);
        Helper.drawTextProgressBar("loaded rows", rowLog, dataA.Count());
    }
    Console.WriteLine("\nencrypting data");
    var output = new List<string>();
    foreach (var line in dataA)
    {
        try
        {
            string[] entries = line.Split(';');
            string abc = entries[0].ToLower();
            string firstName = koeln.GetPhonetics(entries[1]).ToLower();
            string lastName = koeln.GetPhonetics(entries[2]).ToLower();
            string date = entries[3];
            // The three previous variables are concatenated here.
            string NVG = firstName + "_" + lastName + "_" + date;
            string gender = entries[4].ToLower();
            string age = Helper.Left(Convert.ToString(20171027 - Convert.ToInt32(entries[3])), 2);
            string zid = Guid.NewGuid().ToString();
            string fid = entries[5].ToLower();
            rowdataA++;
            output.Add($"{abc}; {NVG}; {gender}; {age}; {zid}; {fid}");
            Helper.drawTextProgressBar("encrypted rows.", rowdataA, dataA.Count());
        }
        catch { rowdataA++; }
    }
    File.WriteAllLines(fileTest, output);
}
I'm fairly new to development, so I'm just experimenting, and any help would be appreciated.
You can read the file this way:
string file = @"read\" + Path.GetFileName(fileNames[i]);
var content = File.ReadLines(file);
var header = content.ElementAt(0);
var lines = content.Skip(1);
List<string> lines = File.ReadLines(file).ToList();
This contains all the lines from the file. We know that the first line is the header, and the rest is the content.
List<string> contentLines = lines.Skip(1).ToList();
This is what you had in your code. It contains all lines except the first.
So how do we get only the header line?
string headerLine = lines.First();
There we go. Notice that this returns a single string, not a list of strings.
If you want to receive a list of strings (e.g. if you have a header that spans two or more lines), then you can do:
List<string> headerLines = lines.Take(amount_of_header_lines).ToList();
List<string> contentLines = lines.Skip(amount_of_header_lines).ToList();
Simply put, Take(X) takes the first X items, and Skip(X) takes everything except the first X items.
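The Take/Skip split described above can be tried without touching the file system. The following is only a minimal sketch: the class name HeaderSplitDemo, the method SplitHeader, the parameter headerLineCount and the sample rows are all made up for illustration, and the in-memory list stands in for lines that were read from a file once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HeaderSplitDemo
{
    // Splits already-loaded lines into header lines and content lines.
    // SplitHeader and headerLineCount are hypothetical names for this sketch.
    public static (List<string> Header, List<string> Content) SplitHeader(
        IReadOnlyCollection<string> lines, int headerLineCount)
    {
        // Take(n) yields the first n items; Skip(n) yields everything after them.
        var header = lines.Take(headerLineCount).ToList();
        var content = lines.Skip(headerLineCount).ToList();
        return (header, content);
    }

    public static void Main()
    {
        // In-memory stand-in for lines read from a file (sample data is invented).
        var lines = new List<string>
        {
            "abc;firstname;lastname;date;gender;id",
            "x1;Anna;Schmidt;19900101;f;17",
            "x2;Ben;Meyer;19851231;m;18",
        };

        var (header, content) = SplitHeader(lines, 1);

        Console.WriteLine($"Header lines: {header.Count}");
        Console.WriteLine($"Content lines: {content.Count}");
    }
}
```

Because the lines are materialized in a list before Take and Skip run, both calls work on the same in-memory data instead of re-reading the source.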
Notice that I put lines = File.ReadLines(file) in a separate variable first. If I had called File.ReadLines(file) for both the header lines and the content lines (instead of using the lines variable), I would have read the file twice. That may not matter to you now, but it is pointless work and can lead to performance issues.
I used First(). You might want to use FirstOrDefault() (or you might not). But that ties into a different discussion that is not the focus here.
Be careful with quoted values. For example, note that this data represents only three columns: ColumnA;"ColumnB;StillColumnB";ColumnC. The semicolon inside the quotes is part of the value, not a delimiter. Your code (line.Split(';')) will not account for that, and neither does File.ReadLines().
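To make the quoting problem concrete, here is a rough sketch of a quote-aware split compared with the naive line.Split(';'). The class QuotedSplitDemo and the method SplitLine are hypothetical helpers written for this illustration, not part of any library. The sketch assumes double quotes as the quote character, treats a doubled quote inside a quoted field as a literal quote, and does not handle fields that contain line breaks. For real data, a dedicated CSV parser is usually the safer choice.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class QuotedSplitDemo
{
    // Splits one line on the delimiter, ignoring delimiters inside double quotes.
    // A doubled quote ("") inside a quoted field is kept as a single literal quote.
    // Simplification: fields that span several lines are not supported.
    public static List<string> SplitLine(string line, char delimiter = ';')
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (c == '"')
            {
                if (inQuotes && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // escaped quote inside a quoted field
                    i++;                 // skip the second quote of the pair
                }
                else
                {
                    inQuotes = !inQuotes; // opening or closing quote
                }
            }
            else if (c == delimiter && !inQuotes)
            {
                fields.Add(current.ToString()); // field boundary outside quotes
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString()); // last field has no trailing delimiter
        return fields;
    }

    public static void Main()
    {
        const string line = "ColumnA;\"ColumnB;StillColumnB\";ColumnC";

        Console.WriteLine(line.Split(';').Length); // naive split: 4 parts
        Console.WriteLine(SplitLine(line).Count);  // quote-aware split: 3 fields
    }
}
```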
If I understood correctly, you need to read the whole file, process all the lines except the header, then write back a different file with the header and the processed lines, right?
If so, the following approach should work:
var allLines = File.ReadAllLines(originalFile);
var headerLine = allLines.First();
var dataLines = allLines.Skip(1);
var processedLines = ProcessLines(dataLines);
File.WriteAllLines(newFile, (new[] {headerLine}.Concat(processedLines)).ToArray());
The ProcessLines method would accept the original lines as a parameter and return a list with the processed lines:
IEnumerable<string> ProcessLines(IEnumerable<string> originalLines)
{
    var processedLines = new List<string>();
    foreach (var line in originalLines)
    {
        // Generate your processed line here; this placeholder passes the line through unchanged.
        var processedLine = line;
        processedLines.Add(processedLine);
    }
    return processedLines;
}
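Putting the pieces together, the whole round trip (read, keep the header, process the rest, write header plus processed lines) might look like the sketch below. The class HeaderRoundTripDemo, the method RewriteKeepingHeader and the lower-casing step inside ProcessLines are assumptions made only so the example has something to run. The sample rows are invented, and temporary files stand in for the real input and output paths.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class HeaderRoundTripDemo
{
    // Placeholder processing step (an assumption for this sketch):
    // lower-cases every data line. Replace it with the real per-line logic.
    public static IEnumerable<string> ProcessLines(IEnumerable<string> originalLines)
    {
        return originalLines.Select(line => line.ToLowerInvariant()).ToList();
    }

    // Reads sourcePath, processes all lines except the first, and writes
    // the untouched header followed by the processed lines to targetPath.
    public static void RewriteKeepingHeader(string sourcePath, string targetPath)
    {
        string[] allLines = File.ReadAllLines(sourcePath);
        if (allLines.Length == 0)
        {
            File.WriteAllLines(targetPath, allLines); // empty input, empty output
            return;
        }

        string headerLine = allLines[0];
        IEnumerable<string> processed = ProcessLines(allLines.Skip(1));
        File.WriteAllLines(targetPath, new[] { headerLine }.Concat(processed));
    }

    public static void Main()
    {
        string source = Path.GetTempFileName();
        string target = Path.GetTempFileName();
        try
        {
            File.WriteAllLines(source, new[] { "ABC;FirstName", "X1;Anna", "X2;Ben" });
            RewriteKeepingHeader(source, target);

            foreach (string line in File.ReadAllLines(target))
            {
                Console.WriteLine(line);
            }
        }
        finally
        {
            // Clean up the temporary files created for the demonstration.
            File.Delete(source);
            File.Delete(target);
        }
    }
}
```

The header is written back exactly as it was read, while only the data lines go through ProcessLines. The explicit check for an empty file avoids the exception that First() or an index access would otherwise throw.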