
Split CSV files with header in C#

I need to split large CSV files by the source field and name each export file after the value of that field.

My code works, except for one thing: each split file needs to start with the header row from the original file.

Any help is appreciated. Thank you.

var splitQuery = from line in File.ReadLines(@"C:\test\test1.csv")
            let source = line.Split(',').Last()
            group line by source into outputs
            select outputs;

foreach (var output in splitQuery)
{
    File.WriteAllLines(@"C:\test\" + output.Key + ".csv", output);
}

I'm not sure how to add a snippet of the CSV, but here are the header fields. I hope this helps:

ID ,Ref ,Title ,Initials ,Forename ,Surname ,File_Source

I strongly recommend using a specialized library for parsing CSV files, one that treats the first line as headers and handles everything else for you. The CSV format is not as simple as it looks at first sight: for example, values may be wrapped in quotes ("value"), and quotes may be escaped inside values.
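To illustrate the quoting problem, here is a minimal sketch. The sample row and the names NaiveSplitDemo, Row and NaiveSplit are made up for this example; they do not come from the question's data.

```csharp
using System;

public static class NaiveSplitDemo
{
    // Hypothetical sample row with three logical fields: 1 | Smith, John | fileA
    // The second field is quoted because it contains a comma.
    public const string Row = "1,\"Smith, John\",fileA";

    // Splits on every comma and knows nothing about quotes.
    public static string[] NaiveSplit(string line)
    {
        return line.Split(',');
    }
}
```

NaiveSplitDemo.NaiveSplit(NaiveSplitDemo.Row) returns four pieces instead of three, because the comma inside the quoted name is treated as a separator. If the quoted value were in the last column, line.Split(',').Last() would return only a fragment of it.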

Personally, I prefer CsvHelper, which works with both classic .NET and .NET Core:

using (var fileRdr = new StreamReader(@"C:\test\test1.csv")) {
    var csvRdr = new CsvReader(fileRdr,
                       new CsvConfiguration() { HasHeaderRecord = true });
    while (csvRdr.Read())
    {
        // list of csv headers
        // (member names follow the CsvHelper version this answer was written for;
        //  newer releases may expose the headers differently)
        var csvFields = csvRdr.FieldHeaders;

        // get individual value by field name
        var sourceVal = csvRdr.GetField<string>("File_Source");

        // perform your data transformation logic here
    }
}

Simply read the header line first:

var fileLinesIterator = File.ReadLines(...);

string headerLine = fileLinesIterator.First();

Then skip it when grouping (File.ReadLines starts from the top of the file each time it is enumerated) and prepend it to every output:

var splitQuery = from line in fileLinesIterator.Skip(1)

// ...


    File.WriteAllLines(@"C:\test\" + output.Key + ".csv", new[] { headerLine }.Concat(output));
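The header-first approach can be put together as one self-contained helper. This is only a sketch under some assumptions: the class and method names (CsvSplitter, SplitByLastColumn) are made up for illustration, the source field is the last column as in the question, no value contains a quoted comma or a line break, and every source value is a valid file name.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class CsvSplitter
{
    // Hypothetical helper: writes one CSV per distinct value of the last column,
    // each starting with the header row of the input file.
    // Returns the paths of the files it wrote.
    public static List<string> SplitByLastColumn(string inputPath, string outputDir)
    {
        // First() throws InvalidOperationException if the file is empty.
        string headerLine = File.ReadLines(inputPath).First();

        // File.ReadLines is lazy and reads from the top on each enumeration,
        // so the header must be skipped explicitly before grouping.
        var groups = File.ReadLines(inputPath)
            .Skip(1)
            .Where(line => !string.IsNullOrWhiteSpace(line))
            // Assumption: a plain comma split is good enough for this data.
            .GroupBy(line => line.Split(',').Last().Trim());

        var written = new List<string>();
        foreach (var group in groups)
        {
            string outputPath = Path.Combine(outputDir, group.Key + ".csv");

            // Header first, then the data lines that belong to this group.
            File.WriteAllLines(outputPath, new[] { headerLine }.Concat(group));
            written.Add(outputPath);
        }
        return written;
    }
}
```

With the paths from the question, the call would be CsvSplitter.SplitByLastColumn(@"C:\test\test1.csv", @"C:\test"). GroupBy buffers all data lines in memory before the first file is written, which may matter for very large inputs.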

But apart from that, you don't want to be handling CSV files as mere (lines of) strings. You're bound to run into trouble with quoted and multiline values.

