Parsing A Complex, Multiline CSV File in C#

I have searched the site and found many examples of parsing CSV in C# using a variety of methods, but none that help in this specific scenario. I have a complex CSV file that needs parsing. Here is a sample of the header data:

REPORT TITLE,New Query,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
REPORT DESCRIPTION,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
GENERATED,12/20/2019 7:33 AM ET,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Client Name,Client A,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Time Frame,Last Completed Period,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,Calendar year,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Received Date,Custom,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,01/01/2015 - 12/31/2015,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Service Date,Custom,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,01/01/2015 - 12/31/2015,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Adjustments,CHOICE(S),,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,Phone Calibration,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
View,N/A,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
SERVICE LINE,Service Line Example A,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
SITE,General 1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,General 2,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,General 3,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,General 4,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,General 5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,General 7,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
FILTER,CHOICE(S),,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Client ID,'00001',,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,'00002',,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,'00003',,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,'00004',,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,'00005',,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,'00006',,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

There is NOTHING I can do about the formatting of the CSV file, as it is part of a legacy system. The 40 commas at the end of each row, as well as the comma-only rows used as separators, are generated by that system.
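
To make the layout concrete, here is a minimal sketch of how a single row breaks down once the trailing commas are ignored: a label in column one (empty on continuation rows) and a value in column two. The class and method names (RowShape, SplitRow) are hypothetical, and the sketch assumes no field contains a comma or a quote, which holds for the sample above but may not hold for the full file.

```csharp
using System;

static class RowShape
{
    // Returns { label, value } for one raw row.
    // Assumption: fields never contain commas or quotes (true for the sample only).
    public static string[] SplitRow(string line)
    {
        // Drop the run of trailing commas added by the legacy system.
        string trimmed = line.TrimEnd(',');

        // Split into at most two parts: label (column one) and value (column two).
        string[] parts = trimmed.Split(new[] { ',' }, 2);

        string label = parts[0];                          // "" on continuation rows
        string value = parts.Length > 1 ? parts[1] : "";  // "" when the row has no value
        return new[] { label, value };
    }
}
```

Under those assumptions, a row such as "SITE,General 1,,," yields the label "SITE" with the value "General 1", a continuation row such as ",General 2,,," yields an empty label, and a comma-only separator row yields two empty strings.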

Here is where I am with my code so far...

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.VisualBasic.FileIO;
using System.Text.RegularExpressions;


namespace ConsoleUI
{
    class Program
    {
        static void Main(string[] args)
        {
            var sourcePath = @"L:\sourceData.csv";
            var delimiter = ",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,";
            var tempPath = Path.GetTempFileName();
            var lineNumber = 0;

            var splitExpression = new Regex(@"(" + delimiter + @")(,)(?=(?:[^""]|""[^""]*"")*$)");

            using (var writer = new StreamWriter(tempPath))
            using (var reader = new StreamReader(sourcePath))

            {
                string line = null;

                while ((line = reader.ReadLine()) != null)
                {
                    lineNumber++;

                    var rows = splitExpression.Split(line).Where(s => s != delimiter).ToArray();

                    // This is where I need to place the parsed data into objects

                    writer.WriteLine(string.Join(delimiter, rows));
                }

            }
        }
    }
}

Ultimately, I need to move each parsed piece of data into its own defined object. I have that class already built.

ANY help that can be provided would be considered a holiday miracle at this point! Thanks for your time.
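
The code above imports Microsoft.VisualBasic.FileIO without using it. As a hedged sketch, that namespace's TextFieldParser could stand in for the regular expression if the real file ever contains quoted fields with embedded commas. The helper name (QuotedRowReader.ReadRows) is hypothetical, and the project is assumed to reference the Microsoft.VisualBasic assembly. Each row is returned with its trailing empty fields removed, so the leading empty field of a continuation row is preserved.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.VisualBasic.FileIO;

static class QuotedRowReader
{
    // Reads all rows from the source, trimming trailing empty fields from each.
    // A comma-only separator row becomes an empty list.
    public static List<List<string>> ReadRows(TextReader source)
    {
        var rows = new List<List<string>>();
        using (var parser = new TextFieldParser(source))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true; // "a,b" in quotes stays one field

            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();

                // Find the last non-empty field and keep everything up to it.
                int count = fields.Length;
                while (count > 0 && fields[count - 1].Length == 0) count--;
                rows.Add(fields.Take(count).ToList());
            }
        }
        return rows;
    }
}
```

It could be called as QuotedRowReader.ReadRows(new StreamReader(sourcePath)). Note that TextFieldParser skips completely blank lines, which does not affect the comma-only separator rows in this file.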

Try the following:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;

namespace ConsoleApplication1
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.csv";
        static void Main(string[] args)
        {
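            Report report = new Report();
            string header = ""; // most recent label; rows starting with a comma continue it
            // File.ReadLines streams the file and closes it once enumeration finishes,
            // so no reader is left undisposed.
            foreach (string line in File.ReadLines(FILENAME))
            {
                List<string> data = new List<string>();
                // RemoveEmptyEntries discards the trailing run of commas (and any other empty field).
                string[] row = line.Split(new char[] { ',' }, StringSplitOptions.RemoveEmptyEntries);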
                if (row.Length > 0)
                {
                    if (line.StartsWith(","))
                    {
                        data = row.ToList();
                    }
                    else
                    {
                        header = row[0];
                        data = row.Skip(1).ToList();
                    }
                    if (data.Count == 0) continue;

                    switch (header)
                    {
                        case "REPORT TITLE":
                            report.title = data[0];
                            break;
                        case "REPORT DESCRIPTION":
                            report.description = data[0];
                            break;
                        case "GENERATED":
                            report.generated = data[0];
                            break;
                        case "Client Name":
                            report.name = data[0];
                            break;
                        case "Time Frame":
                            if (report.timeFrame == null) report.timeFrame = new List<string>();
                            report.timeFrame.AddRange(data);
                            break;
                        case "Received Date":
                            if (report.receivedDate == null) report.receivedDate = new List<string>();
                            report.receivedDate.AddRange(data);
                            break;
                        case "Service Date":
                            if (report.serviceDate == null) report.serviceDate = new List<string>();
                            report.serviceDate.AddRange(data);
                            break;
                        case "Adjustments":
                            if (report.adjustments == null) report.adjustments = new List<string>();
                            report.adjustments.AddRange(data);
                            break;
                        case "View":
                            if (report.view == null) report.view = new List<string>();
                            report.view.AddRange(data);
                            break;
                        case "SERVICE LINE":
                            if (report.serviceLine == null) report.serviceLine = new List<string>();
                            report.serviceLine.AddRange(data);
                            break;
                        case "SITE":
                            if (report.site == null) report.site = new List<string>();
                            report.site.AddRange(data);
                            break;
                        case "FILTER":
                            if (report.filter == null) report.filter = new List<string>();
                            report.filter.AddRange(data);
                            break;
                        case "Client ID":
                            if (report.clientId == null) report.clientId = new List<string>();
                            report.clientId.AddRange(data);
                            break;
                    }
                }

            }
        }
    }
    public class Report
    {
        public string title { get; set; }
        public string description { get; set; }
        public string generated { get; set; }
        public string name { get; set; }
        public List<string> timeFrame { get; set; }
        public List<string> receivedDate { get; set; }
        public List<string> serviceDate { get; set; }
        public List<string> adjustments { get; set; }
        public List<string> view { get; set; }
        public List<string> serviceLine { get; set; }
        public List<string> site { get; set; }
        public List<string> filter { get; set; }
        public List<string> clientId { get; set; }
    }
}
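
If the set of labels is not fixed, a more generic variant of the same idea may be easier to maintain than the switch statement. The following is only a sketch, not part of the answer above: it groups every value under the most recent label in a dictionary. The names HeaderParser and Parse are hypothetical, and like the code above it assumes no field contains a comma.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class HeaderParser
{
    // Groups values under the most recent label. Rows that start with a comma
    // continue the previous label. Assumption: no quoted fields or embedded commas.
    public static Dictionary<string, List<string>> Parse(TextReader reader)
    {
        var result = new Dictionary<string, List<string>>();
        string header = null;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            string[] row = line.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
            if (row.Length == 0) continue; // blank line or comma-only separator row

            int first = 0;
            if (line[0] != ',')
            {
                // A new label: remember it and start (or reuse) its value list.
                header = row[0];
                first = 1;
                if (!result.ContainsKey(header)) result[header] = new List<string>();
            }
            if (header == null) continue; // continuation row before any label

            for (int i = first; i < row.Length; i++) result[header].Add(row[i]);
        }
        return result;
    }
}
```

A call such as HeaderParser.Parse(new StreamReader(FILENAME)) would return one entry per label, for example a "SITE" entry holding every "General ..." value, which can then be copied into the Report properties. Labels that repeat in the file are merged into a single list under this approach.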

