
Read CSV using LINQ

I have a CSV file like this:

A, 22, 23, 12
B, 32, 4, 33
C, 34, 3 ,33

I want to print the sum and average of each row, skipping the first column. How can I do this in LINQ using lambda expressions?

var stuff = from l in File.ReadAllLines(filename)
            let x = l.Split(new [] {',', ' '}, StringSplitOptions.RemoveEmptyEntries)
                     .Skip(1)
                     .Select(s => int.Parse(s))
            select new
            {
                Sum = x.Sum(),
                Average = x.Average()
            };

If you're reading big files and memory use is a concern, then the following will work better using .NET 4:

var stuff = from l in File.ReadLines(filename)
            let x = l.Split(new [] {',', ' '}, StringSplitOptions.RemoveEmptyEntries)
                     .Skip(1)
                     .Select(s => int.Parse(s))
            select new
            {
                Sum = x.Sum(),
                Average = x.Average()
            };

In both cases, the stuff variable contains an enumerable that won't actually be executed until you start reading from it (e.g., inside a foreach loop).
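To make the deferred execution concrete, here is a minimal sketch in lambda syntax. It assumes C# 9 or later (top-level statements) and uses the sample rows from the question in place of a file; the parsedRows counter is only there for illustration.

```csharp
using System;
using System.Linq;

// Sample rows from the question, used here instead of a file on disk.
var rows = new[] { "A, 22, 23, 12", "B, 32, 4, 33", "C, 34, 3 ,33" };

int parsedRows = 0; // illustrative counter: how many rows have actually been parsed

var stats = rows.Select(row =>
{
    parsedRows++;
    // Materialize the numbers once so Sum and Average don't parse the row twice.
    var numbers = row.Split(new[] { ',', ' ' }, StringSplitOptions.RemoveEmptyEntries)
                     .Skip(1)
                     .Select(s => int.Parse(s))
                     .ToList();
    return new { Sum = numbers.Sum(), Average = numbers.Average() };
});

// The query has only been described so far; no row has been parsed.
Console.WriteLine("Parsed before enumeration: " + parsedRows);

// ToList() runs the query once and keeps the results.
var results = stats.ToList();
Console.WriteLine("Parsed after enumeration: " + parsedRows);

foreach (var r in results)
{
    Console.WriteLine("Sum: {0}, Average: {1}", r.Sum, r.Average);
}
```

Calling ToList() is a design choice: without it, every new foreach over stats would parse the rows again.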

string csvFile = @"myfile.csv";
string[] lines = File.ReadAllLines(csvFile);

var values = lines.Select(l => new
{
    FirstColumn = l.Split(',').First(),
    Values = l.Split(',').Skip(1).Select(v => int.Parse(v))
});
foreach (var value in values)
{
    Console.WriteLine("Column '{0}', Sum: {1}, Average: {2}", value.FirstColumn, value.Values.Sum(), value.Values.Average());
}

Try this old but still good library: FileHelpers.

It's very easy to use:

char delimiter = ',';
var dt = FileHelpers.CsvEngine.CsvToDataTable(fileName,delimiter);

then just do:

var rowStats = dt.AsEnumerable()
                 // Skip the first column (the row label), which is not numeric.
                 .Select(x => x.ItemArray.Skip(1).Select(y => Convert.ToInt32(y)))
                 .Select(x => new { avg = x.Average(), sum = x.Sum() });

foreach (var rowStat in rowStats)
{
    Console.WriteLine("Sum: {0}, Avg: {1}", rowStat.sum, rowStat.avg);
}

string[] csvlines = File.ReadAllLines(txtCSVFile.Text);

var query = from csvline in csvlines
  let data = csvline.Split(',')
  select new
  {
   ID = data[0],
   FirstNumber = data[1],
   SecondNumber = data[2],
   ThirdNumber = data[3]
  };

I have just discovered the LinqToCsv library. It does all the parsing, lets you query the resulting objects like any other collection, and supports deferred reading:

http://www.codeproject.com/Articles/25133/LINQ-to-CSV-library

Hi, you are looking for something like this:

var rows = new List<string> {"A, 22, 23, 12", "B, 32, 4, 33", "C, 34, 3 ,33"};
foreach (var row in rows)
{
    var sum = row.Split(',').Skip(1).Sum(x => Convert.ToInt32(x));
    var avg = row.Split(',').Skip(1).Average(x => Convert.ToInt32(x));
    Console.WriteLine("Sum: {0}, Average: {1}", sum, avg);
}

Something like this maybe:

var csv = @"A, 22, 23, 12
B, 32, 4, 33
C, 34, 3 ,33";

var lines =
    csv.Split('\n').Select(x => x.Split(',').Skip(1).Select(n => int.Parse(n))).Select(x => new {Sum = x.Sum(), Average = x.Average()});
foreach (var line in lines)
{
    Console.WriteLine("Sum: " + line.Sum);
    Console.WriteLine("Average: " + line.Average);
}

In general, I don't recommend doing something like this. You should use a full-blown CSV reader to parse the CSV file, and you should include error handling.
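As a rough illustration of that advice, the sketch below uses TextFieldParser from Microsoft.VisualBasic.FileIO (one CSV reader that ships with .NET, not necessarily the one this answer had in mind) together with int.TryParse, so a bad value is reported instead of throwing. It assumes C# 9 or later for top-level statements; the sample data, including the deliberately broken "oops" value, is made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.VisualBasic.FileIO;

// Made-up sample standing in for a file; the second row contains a bad value.
var csv = "A, 22, 23, 12\nB, 32, oops, 33\nC, 34, 3 ,33";

var sums = new List<int>();      // sums of the rows that parsed cleanly
var errors = new List<string>(); // one message per rejected row

using (var parser = new TextFieldParser(new StringReader(csv)))
{
    parser.TextFieldType = FieldType.Delimited;
    parser.SetDelimiters(",");
    parser.HasFieldsEnclosedInQuotes = true; // commas inside quotes stay in the field
    parser.TrimWhiteSpace = true;            // " 22" becomes "22"

    int rowNumber = 0;
    while (!parser.EndOfData)
    {
        rowNumber++;
        string[] fields = parser.ReadFields();
        var numbers = new List<int>();
        string badValue = null;

        // Skip the first column (the row label) and validate the rest.
        foreach (var field in fields.Skip(1))
        {
            int value;
            if (int.TryParse(field, out value)) numbers.Add(value);
            else { badValue = field; break; }
        }

        if (badValue != null)
        {
            errors.Add("Row " + rowNumber + ": '" + badValue + "' is not an integer");
            continue;
        }
        if (numbers.Count == 0)
        {
            errors.Add("Row " + rowNumber + ": no numeric columns");
            continue;
        }

        sums.Add(numbers.Sum());
        Console.WriteLine("Sum: {0}, Average: {1}", numbers.Sum(), numbers.Average());
    }
}

foreach (var error in errors) Console.WriteLine(error);
```

Whether to skip a bad row, as done here, or abort the whole import depends on the data; the sketch only shows where that decision would go.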

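using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Turn the file into an IEnumerable (streaming works better for larger files).
IEnumerable<Tuple<int, int, int>> GetTypedEnumerator(string filePath)
{
    // The using block closes the file when enumeration ends or is abandoned.
    using (var reader = File.OpenText(filePath))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // fields[0] is the row label; the three numbers follow it.
            var fields = line.Split(',');
            yield return Tuple.Create(
                int.Parse(fields[1]),
                int.Parse(fields[2]),
                int.Parse(fields[3]));
        }
    }
}

// These lines return the sum and average for each line.
var tot = GetTypedEnumerator(@"C:\file.csv").Select(l => l.Item1 + l.Item2 + l.Item3);
// Divide by 3.0 so the average is not truncated by integer division.
var avg = GetTypedEnumerator(@"C:\file.csv").Select(l => (l.Item1 + l.Item2 + l.Item3) / 3.0);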

The streaming approach lets you handle larger files because you don't need to load them into memory first. I don't have VS here and haven't checked the syntax, so it might not compile as is.
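To see the streaming behaviour in isolation, here is a small sketch that reads from a StringReader instead of a file, so it runs without any file on disk. ReadNumericRows is a hypothetical helper name, and the code assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper: lazily yields the numeric columns of each row.
// A row is only read from the reader when the caller asks for it.
IEnumerable<int[]> ReadNumericRows(TextReader reader)
{
    string row;
    while ((row = reader.ReadLine()) != null)
    {
        yield return row.Split(',')
                        .Skip(1) // drop the row label
                        .Select(s => int.Parse(s.Trim()))
                        .ToArray();
    }
}

var input = new StringReader("A, 22, 23, 12\nB, 32, 4, 33\nC, 34, 3 ,33");

// First() pulls exactly one row through the iterator.
var first = ReadNumericRows(input).First();
Console.WriteLine("First row sum: " + first.Sum());

// The remaining rows are still unread; the reader sits at the start of row B.
Console.WriteLine("Next unread character: " + (char)input.Peek());
```

With a real file, the same shape works by passing File.OpenText(path) as the reader, and at most one line needs to be held in memory at a time.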

Regards GJ

Damn, lot of answers already, need to type faster!

Actually, in most cases you should avoid splitting on ',' alone, because a string field can itself contain a comma.

Here is a more generic, easy-to-use solution using Regex:

var stuff = File.ReadAllLines(csvFilePath)
    .Skip(1) // For header
    .Select(s => Regex.Match(s, @"^(.*?),(.*?),(.*?),(.*?),(.*?)$"))
    .Select(data => new 
    {
        Foo = data.Groups[1].Value,
        Bar = data.Groups[2].Value,
        One = data.Groups[3].Value,
        Two = data.Groups[4].Value,
    });

You can find more details here: https://stackoverflow.com/a/18147076/196526
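To illustrate the comma-in-a-string problem mentioned above, the sketch below compares a plain Split(',') with a small quote-aware splitter. SplitCsvLine is a hypothetical helper written for this example, not part of any library. It is deliberately simplified: it does not handle escaped quotes ("") or fields that span several lines. It assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper: splits one CSV line on commas that are outside double quotes.
List<string> SplitCsvLine(string text)
{
    var fields = new List<string>();
    var current = new StringBuilder();
    bool inQuotes = false;

    foreach (char c in text)
    {
        if (c == '"')
        {
            inQuotes = !inQuotes; // entering or leaving a quoted field
        }
        else if (c == ',' && !inQuotes)
        {
            fields.Add(current.ToString().Trim()); // field boundary
            current.Clear();
        }
        else
        {
            current.Append(c);
        }
    }
    fields.Add(current.ToString().Trim()); // last field has no trailing comma
    return fields;
}

var sample = "\"Smith, John\", 22, 23, 12";

var naive = sample.Split(',');
var fields = SplitCsvLine(sample);

Console.WriteLine(naive.Length);  // 5: the quoted name was cut in two
Console.WriteLine(fields.Count);  // 4: the name stays in one field

var numbers = fields.Skip(1).Select(s => int.Parse(s)).ToList();
Console.WriteLine("Sum: {0}, Average: {1}", numbers.Sum(), numbers.Average());
```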

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.
