
Split CSV files with header in C#

I need to split large CSV files by the source field and name each export file after the value of that field.

My code works, except for one thing: each split file needs to start with the header row from the original file.

Any help is appreciated. Thank you.

var splitQuery = from line in File.ReadLines(@"C:\test\test1.csv")
            let source = line.Split(',').Last()
            group line by source into outputs
            select outputs;

foreach (var output in splitQuery)
{
    File.WriteAllLines(@"C:\test\" + output.Key + ".csv", output);
}

I'm not sure how to add a snippet of the CSV, but here are the header fields. I hope this helps:

ID,Ref,Title,Initials,Forename,Surname,File_Source

I strongly recommend using a specialized library for parsing CSV files, one that treats the first line as headers and handles everything else for you. The CSV format is not as simple as it looks at first sight: for example, values may be wrapped in quotes ("value"), and quotes may be escaped inside values.
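To illustrate the quoting problem, here is a minimal sketch of what the question's `line.Split(',').Last()` does with a quoted value. The class name, method name, and sample record are made up for this example and are not from the original file.

```csharp
using System;
using System.Linq;

public static class NaiveSplitDemo
{
    // Same logic as the question's query: take the last comma-separated token.
    public static string LastFieldNaive(string line) => line.Split(',').Last();

    public static void Main()
    {
        // Hypothetical record: the last field is quoted and contains a comma.
        string line = "1,A1,Mr,J,John,Smith,\"Source, North\"";

        // The naive split cuts the quoted value in two, so only the tail
        // of the field comes back instead of the whole "Source, North".
        Console.WriteLine("[" + LastFieldNaive(line) + "]");
    }
}
```

With such a record the grouping key, and therefore the output file name, would come from a fragment of the field rather than the field itself.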

Personally I prefer CsvHelper, which works with both the classic .NET Framework and .NET Core:

using (var fileRdr = new StreamReader(@"C:\test\test1.csv")) {
    // Note: this uses an older CsvHelper API. Newer releases appear to require
    // a CultureInfo argument and expose the headers as HeaderRecord instead.
    var csvRdr = new CsvReader( fileRdr, 
                       new CsvConfiguration() { HasHeaderRecord = true } );
    while( csvRdr.Read() )
    {
        // list of csv headers
        var csvFields = csvRdr.FieldHeaders;

        // get individual value by field name
        var sourceVal = csvRdr.GetField<string>( "File_Source" );

        // perform your data transformation logic here 
    }   
}

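Simply read the header line first:

var fileLinesIterator = File.ReadLines(...);

// First() returns the header as a string; Take(1) would return a sequence.
string headerLine = fileLinesIterator.First();

Then prepend it to every output, skipping the header row when grouping:

// File.ReadLines starts from the top again on each enumeration, so skip the header.
var splitQuery = from line in fileLinesIterator.Skip(1)

// ...


    // Writes the header followed by the group's lines (needs using System.Linq).
    File.WriteAllLines(@"C:\test\" + output.Key + ".csv", new[] { headerLine }.Concat(output));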

But apart from that, you don't want to be handling CSV files as mere (lines of) strings. You're bound to run into trouble with quoted and multiline values.
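For simple files with no quoted or multiline values, the header-prepending idea above can be put together roughly as follows. This is only a sketch: `CsvSplitter`, `SplitByLastField`, and the parameters are hypothetical names, and it assumes the source field is the last column and that its values are valid file names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class CsvSplitter
{
    // Writes one file per distinct value of the last column, each starting
    // with the original header row. Assumes no quoted commas or multiline
    // values, because it still relies on a plain Split(',').
    public static void SplitByLastField(string inputPath, string outputDir)
    {
        // File.ReadLines is lazy and restarts from the top on each enumeration.
        IEnumerable<string> lines = File.ReadLines(inputPath);
        string header = lines.First();

        var groups = lines
            .Skip(1)                                      // skip the header row
            .Where(l => !string.IsNullOrWhiteSpace(l))    // ignore blank lines
            .GroupBy(l => l.Split(',').Last().Trim());

        foreach (var group in groups)
        {
            string outPath = Path.Combine(outputDir, group.Key + ".csv");
            // Header first, then every line that belongs to this source value.
            File.WriteAllLines(outPath, new[] { header }.Concat(group));
        }
    }
}
```

A call such as `CsvSplitter.SplitByLastField(@"C:\test\test1.csv", @"C:\test")` would mirror the paths used in the question. Grouping holds the whole file in memory, so very large inputs may need a streaming approach that keeps one writer per source value instead.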
