
How to filter CSV data using C#

I have a csv file containing string data. I would like to filter out every data row whose "Job" column is empty, then create a new csv file and write the remaining rows into it.

Every value is wrapped in double quotes. The "Id", "Name", "Job" in the first line are ordinary string data, not column headers, so the rule is simply: if the third value of a row is an empty pair of double quotes "", the row is filtered out.

"Id" , "Name"    , "Job"
"1"  , "Alan"    , "Engineer"
"2"  , "Bob"     , "Technician"
"3"  , "Charlie" , ""
"4"  , "Danny"   , ""

For the csv above, the remaining data is expected to be:

"Id" , "Name"    , "Job"
"1"  , "Alan"    , "Engineer"
"2"  , "Bob"     , "Technician"

You can do it with something like this:

using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using Microsoft.VisualBasic.FileIO;

// Parse the whole file with a single parser instead of creating one parser per line
List<string[]> csvData = new List<string[]>();
using (TextFieldParser parser = new TextFieldParser("csvFile.csv"))
{
    parser.TextFieldType = FieldType.Delimited;
    parser.SetDelimiters(",");
    parser.HasFieldsEnclosedInQuotes = true;
    parser.TrimWhiteSpace = true;
    while (!parser.EndOfData)
        csvData.Add(parser.ReadFields());
}

// Keep only the rows whose third column is not empty (rows with fewer than three columns are dropped)
List<string[]> filteredCsvData = csvData.Where(x => x.Length > 2 && !string.IsNullOrWhiteSpace(x[2])).ToList();

StringBuilder builder = new StringBuilder();
foreach (string[] line in filteredCsvData)
{
    // Put the quotes back around every column
    string[] quotedLine = Array.ConvertAll(line, x => '"' + x + '"');
    // A string separator is used because string.Join(char, ...) does not exist in .NET Framework
    builder.AppendLine(string.Join(",", quotedLine));
}
File.WriteAllText("newCsvFile.csv", builder.ToString());

If you need to quote only those columns that contain commas, use: string[] quotedLine = Array.ConvertAll(line, x => x.Contains(",") ? "\"" + x + "\"" : x);
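Commas are not the only characters that force a field to be quoted. Under the usual CSV convention (RFC 4180), a field containing a double quote or a line break also needs quotes, and embedded quotes are doubled. The following is only a sketch of such a helper; the names CsvQuoting, QuoteIfNeeded and ToCsvLine are made up for illustration and do not come from the answer.

```csharp
using System;

public static class CsvQuoting
{
    // Hypothetical helper: wraps a field in quotes only when it contains
    // a comma, a double quote, or a line break. Embedded quotes are doubled.
    public static string QuoteIfNeeded(string field)
    {
        if (field == null) return "";
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        if (!needsQuotes) return field;
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Joins the fields of one row into a single CSV line.
    public static string ToCsvLine(string[] fields)
    {
        return string.Join(",", Array.ConvertAll(fields, f => QuoteIfNeeded(f)));
    }
}

public static class Program
{
    public static void Main()
    {
        // Only the middle field contains a comma, so only it gets quoted.
        Console.WriteLine(CsvQuoting.ToCsvLine(new[] { "1", "Smith, Alan", "Engineer" }));
    }
}
```

In the loop above, CsvQuoting.ToCsvLine(line) could stand in for the Array.ConvertAll and string.Join pair if selective quoting is preferred over quoting every column.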

NOTE: The parsing code in this answer uses Microsoft.VisualBasic.FileIO.TextFieldParser as the CSV parser. To use it in .NET Framework, you need to add a reference to Microsoft.VisualBasic to your project (it is also available in .NET Core 3.0).

Step 1: Read the CSV into a List.

With the POCO class:

public class Foo
{
    public string Id { get; set; }
    public string Name { get; set; }        
    public string Job { get; set; }
}

You can use CsvHelper to read the file with the following configuration:

Configuration.Delimiter=",";
Configuration.HasHeaderRecord=false; // the first line looks like a header but is ordinary data
Configuration.TrimOptions= TrimOptions.Trim | TrimOptions.InsideQuotes;
Configuration.RegisterClassMap<FooMap>();

Step 2: Filter the result

records.Where(x=> !string.IsNullOrEmpty(x.Job))

Complete code example:

    public static void Main()
    {
        var input = @"""Id"" , ""Name""    , ""Job""
""1""  , ""Alan""    , ""Engineer""
""2""  , ""Bob""     , ""Technician""
""3""  , ""Charlie"" , """"
""4""  , ""Danny""   , """"";

        var records = new List<Foo>();

        // Read the CSV into a List<Foo>.
        // Note: newer CsvHelper releases require a CultureInfo argument in the CsvReader constructor.
        using (TextReader reader = new StringReader(input))
        using (var csvReader = new CsvReader(reader))
        {
            csvReader.Configuration.Delimiter = ",";
            csvReader.Configuration.HasHeaderRecord = false;
            csvReader.Configuration.TrimOptions = TrimOptions.Trim | TrimOptions.InsideQuotes;
            csvReader.Configuration.RegisterClassMap<FooMap>();
            records = csvReader.GetRecords<Foo>().ToList();
        }

        // Dump() is a LINQPad / .NET Fiddle helper that prints the object
        records.Dump();

        // Keep only the records whose Job is not empty
        var result = records.Where(x => !string.IsNullOrEmpty(x.Job)).ToList();
        result.Dump();
    }

    public class Foo
    {
        public string Id { get; set; }
        public string Name { get; set; }        
        public string Job { get; set; }
    }
    public class FooMap : ClassMap<Foo>
    {
        public FooMap()
        {
            Map(m => m.Id).Index(0);
            Map(m => m.Name).Index(1);          
            Map(m => m.Job).Index(2);

            //mapping with column header
            /*Map(m => m.Id).Name("Id");
            Map(m => m.Name).Name("Name");          
            Map(m => m.Job).Name("Job");*/
        }
    }
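The example above stops at filtering, while the question also asks for a new file. A minimal sketch of the write step follows, using only the base class library rather than CsvHelper's own writer. FooCsvWriter, ToLine and Write are hypothetical names, Foo is repeated so that the block compiles on its own, and the output file name is only an example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public class Foo
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string Job { get; set; }
}

public static class FooCsvWriter
{
    // Formats one record as a fully quoted CSV line, matching the input style.
    // Embedded double quotes are doubled; null values become empty fields.
    public static string ToLine(Foo r)
    {
        return string.Join(",", new[] { r.Id, r.Name, r.Job }
            .Select(f => "\"" + (f ?? "").Replace("\"", "\"\"") + "\""));
    }

    // Writes one line per record to the given path.
    public static void Write(string path, IEnumerable<Foo> records)
    {
        File.WriteAllLines(path, records.Select(ToLine));
    }
}

public static class Program
{
    public static void Main()
    {
        var records = new List<Foo>
        {
            new Foo { Id = "1", Name = "Alan", Job = "Engineer" },
            new Foo { Id = "3", Name = "Charlie", Job = "" },
        };

        // Same filter as in step 2, followed by the write step.
        var result = records.Where(x => !string.IsNullOrEmpty(x.Job));
        FooCsvWriter.Write("newCsvFile.csv", result);
    }
}
```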

If you have a CSV helper library, you can use it to parse the file into a List or DataTable, filter the data and export the result.

If you don't, you can try:

using (var reader = new StreamReader(sourcePath))
using (var writer = new StreamWriter(destiPath))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        // Naive split: this assumes that no field contains a comma
        var list = line.Split(',');
        // Keep the row only if the third field is non-empty once spaces and quotes are stripped
        if (list.Length > 2 && !string.IsNullOrEmpty(list[2].Trim(' ', '"')))
        {
            writer.WriteLine(line);
        }
    }
}
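Splitting on every comma breaks as soon as a quoted field itself contains a comma, for example "Smith, Alan". If that can happen and no CSV library is available, the split could be made quote-aware. The sketch below is one way to do that under simplifying assumptions: SimpleCsv and SplitLine are hypothetical names, fields never span several lines, and each field is trimmed after parsing, so leading or trailing spaces inside quotes are lost as well.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class SimpleCsv
{
    // Hypothetical helper: splits one CSV line on commas that lie outside quotes.
    // A doubled quote ("") inside a quoted field becomes one literal quote.
    public static List<string> SplitLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c != '"')
                {
                    current.Append(c);
                }
                else if (i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // escaped quote inside a quoted field
                    i++;
                }
                else
                {
                    inQuotes = false;    // closing quote
                }
            }
            else if (c == '"')
            {
                inQuotes = true;         // opening quote
            }
            else if (c == ',')
            {
                fields.Add(current.ToString().Trim());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString().Trim());
        return fields;
    }
}

public static class Program
{
    public static void Main()
    {
        foreach (var field in SimpleCsv.SplitLine("\"1\"  , \"Smith, Alan\" , \"Engineer\""))
            Console.WriteLine("[" + field + "]");
    }
}
```

With this helper, the condition in the loop above could become fields.Count > 2 && fields[2].Length > 0, where fields is the result of SimpleCsv.SplitLine(line).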


 