
How to extract specific data from lines in a text file and output to a new text file using C#

I have several large .csv files that contain lines of data. I need to extract only specific parts of each line, ignoring the parts I am not interested in, and write the result to a new text file.

For example, here is a section of the data:

Fr 23:59:59 M40 N04161K RX LAG 2 JNYT  17 STORE OCC 1 PRUD 1 RAW  -9 LAG   0

Fr 23:59:59 M08  N09461M  %SAT   3  %CONG   0  MQ 0  EB 0  OSQ     0 NSQ     4

Fr 23:59:59 M20 N09461M SAT   3%  SQ     0  FLOW     4  GN  13  STOC  9

I am looking to write a new file that looks like this:

5,23,59,59,2,17,1,1,-9,0

5,23,59,59,3,0,0,0,0,4

5,23,59,59,3,0,4,13,9

(Note that each output line starts with '5', which I would like to use in place of 'Fr', which stands for 'Friday'.)

Each dataset is identified by its 'M' reference (M40, M08, etc.), and it would be useful to write each dataset to its own file (for example, all M40 lines filtered into one .txt file, hence my 'if' statements).

I would prefer to have each number separated by a comma, but this is not essential.

Here is my code so far:

class Program
{
    static void Main(string[] args)
    {
        String line;
        try
        {
            //Pass the file path and file name to the StreamReader constructor
            StreamReader sr = new StreamReader("C:\\MessExport_20110402_0000.csv");
            StreamWriter sw = new StreamWriter("C:\\output.txt");
            //Read the first line of text
            line = sr.ReadLine();

            //Continue to read until you reach end of file
            while (line != null)
            {
                if (line.Contains("M40"))
                {
                    sw.WriteLine(line);
                }
                if (line.Contains("M08"))
                {
                    sw.WriteLine(line);
                }
                line = sr.ReadLine();
            }

            //close the files
            sr.Close();
            sw.Close();
            //Console.ReadLine();
        }
        catch (Exception e)
        {
            Console.WriteLine("Exception: " + e.Message);
        }
        finally
        {
            Console.WriteLine("Executing finally block.");
            Console.WriteLine("Press any key to exit.");
            Console.ReadKey();
        }


    }
}

It would then be useful to read the next .csv file and again write the results to a new .txt file.

I am very new to regex and Split, so any help would be much appreciated.

Just a straightforward implementation:

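// Requires: using System; using System.Collections.Generic; using System.IO; using System.Linq;
string workingDirectory = @"c:\";

// The index in this array is the day number, so "Fr" becomes 5.
var days = new[] { "Su", "Mo", "Tu", "We", "Th", "Fr", "Sa" };
// One output writer per M reference (M40.txt, M08.txt, ...).
var writers = new Dictionary<string, StreamWriter>();
try
{
    using (StreamReader sr = new StreamReader(Path.Combine(workingDirectory, "data.csv")))
    {
        string line;
        while ((line = sr.ReadLine()) != null)
        {
            var items = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
            // Skip blank or truncated lines that lack a day, time and M reference.
            if (items.Length < 3) continue;

            StreamWriter w;
            if (!writers.TryGetValue(items[2], out w))
            {
                w = new StreamWriter(Path.Combine(workingDirectory, items[2] + ".txt"));
                writers.Add(items[2], w);
            }

            var times = items[1].Split(':');
            // Keep only integer tokens; trim a trailing '%' so that "3%" is kept as 3.
            var digits = items.Skip(3)
                        .Select(x => x.TrimEnd('%'))
                        .Where(x => { int i; return int.TryParse(x, out i); });
            var data = new[] { Array.IndexOf(days, items[0]).ToString() }.Concat(times).Concat(digits);
            w.WriteLine(String.Join(",", data));
        }
    }
}
finally
{
    // Dispose flushes and closes each writer, even if reading failed part-way.
    foreach (var writer in writers.Values)
    {
        writer.Dispose();
    }
}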
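
The question also asks about moving on to the next .csv file. A minimal sketch of that step, assuming all input files sit in one folder and each should produce a .txt file with the same base name: the folder `c:\data` and the names `BatchExample`, `Tokens` and `Transform` are made up for illustration, and `File.ReadLines` needs .NET 4 or later.

```csharp
using System;
using System.IO;
using System.Linq;

class BatchExample
{
    // Index in this array is the day number, so "Fr" becomes 5.
    static readonly string[] Days = { "Su", "Mo", "Tu", "We", "Th", "Fr", "Sa" };

    // Splits a line on spaces, dropping the empty entries caused by repeated spaces.
    static string[] Tokens(string line)
    {
        return line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Builds "day,hh,mm,ss,n1,n2,..." from one raw line with at least three tokens.
    static string Transform(string line)
    {
        var items = Tokens(line);
        int ignored;
        var numbers = items.Skip(3)
                           .Select(x => x.TrimEnd('%'))
                           .Where(x => int.TryParse(x, out ignored));
        var fields = new[] { Array.IndexOf(Days, items[0]).ToString() }
                     .Concat(items[1].Split(':'))
                     .Concat(numbers);
        return string.Join(",", fields);
    }

    static void Main()
    {
        string inputDirectory = @"c:\data"; // assumed location of the .csv files

        foreach (string csvPath in Directory.GetFiles(inputDirectory, "*.csv"))
        {
            // Same folder and base name, with a .txt extension.
            string outputPath = Path.ChangeExtension(csvPath, ".txt");

            var transformed = File.ReadLines(csvPath)
                                  .Where(l => Tokens(l).Length >= 3) // skip blank or short lines
                                  .Select(Transform);

            File.WriteAllLines(outputPath, transformed);
        }
    }
}
```

This writes one output file per input file rather than one per M reference; the two ideas can be combined by keeping the writer dictionary from the code above inside the `foreach` loop.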

This is a quick stab, but I think it will get you part of the way there.

var lines = new List<string> { 
    "Fr 23:59:59 M40 N04161K RX LAG 2 JNYT  17 STORE OCC 1 PRUD 1 RAW  -9 LAG   0",
    "Fr 23:59:59 M08  N09461M  %SAT   3  %CONG   0  MQ 0  EB 0  OSQ     0 NSQ     4",
    "Fr 23:59:59 M20 N09461M SAT   3%  SQ     0  FLOW     4  GN  13  STOC  9"
};
var options = RegexOptions.IgnorePatternWhitespace;
var regex = new Regex("(?: ^\\w\\w | -?\\b\\d+\\b )", options );

foreach (var l in lines ){
    var matches = regex.Matches( l );

    foreach(Match m in matches){
        Console.Write( "{0},", m.Value );
    }
    Console.WriteLine();
}

Produces:

Fr,23,59,59,2,17,1,1,-9,0,
Fr,23,59,59,3,0,0,0,0,4,
Fr,23,59,59,3,0,4,13,9,
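
The output above still starts with "Fr" and ends with a trailing comma. A possible refinement, sketched under the assumption that Sunday counts as day 0 (so Friday is 5): collect the matches into a list, swap the day abbreviation for its index, and let `string.Join` place the commas. The names `RegexExample`, `Token` and `Transform` are only illustrative.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

class RegexExample
{
    // Assumed numbering: Sunday = 0, so "Fr" maps to 5.
    static readonly string[] Days = { "Su", "Mo", "Tu", "We", "Th", "Fr", "Sa" };

    // Same pattern as above: the two leading word characters, or a standalone integer.
    static readonly Regex Token = new Regex(@"^\w\w|-?\b\d+\b");

    static string Transform(string line)
    {
        var values = Token.Matches(line).Cast<Match>().Select(m => m.Value).ToList();
        if (values.Count == 0) return string.Empty;

        // Replace the day abbreviation with its number when it is recognised.
        int day = Array.IndexOf(Days, values[0]);
        if (day >= 0) values[0] = day.ToString();

        // Join puts commas between values only, so there is no trailing comma.
        return string.Join(",", values);
    }

    static void Main()
    {
        Console.WriteLine(Transform(
            "Fr 23:59:59 M40 N04161K RX LAG 2 JNYT  17 STORE OCC 1 PRUD 1 RAW  -9 LAG   0"));
        // prints: 5,23,59,59,2,17,1,1,-9,0
    }
}
```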

Another option is a complete program that filters lines by M reference and transforms each one:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

namespace Program
{
  public class TransformCsv
  {
    [STAThread]
    public static void Main(String[] args)
    {
      (new TransformCsv()).Run(@"c:\temp\MessExport_20110402_0000.csv", @"c:\temp\output.txt", LineFilterFunction);
    }

    public static Boolean LineFilterFunction(String line)
    {
      return line.Contains("M40") || line.Contains("M08");
    }

    ////////////////////

    // Day abbreviations in week order; the 1-based index is the day number ("Fr" -> 5).
    private List<String> _dayOfWeek = new List<String>() { "Mo", "Tu", "We", "Th", "Fr", "Sa", "Su" };

    public void Run(String inputFilePath, String outputFilePath, Func<String, Boolean> lineFilterFunction)
    {
      using (var reader = new StreamReader(inputFilePath))
      {
        using (var writer = new StreamWriter(outputFilePath))
        {
          String line = null;
          while ((line = reader.ReadLine()) != null)
          {
            if (!String.IsNullOrWhiteSpace(line) && lineFilterFunction(line))
              writer.WriteLine(this.GetTransformedLine(line));
          }
        }
      }
    }

    private static Char[] _spaceCharacter = " ".ToCharArray();

    private String GetTransformedLine(String line)
    {
      var elements = line.Split(_spaceCharacter, StringSplitOptions.RemoveEmptyEntries);

      var result = new List<String>();
      // Day of week as a 1-based number.
      result.Add((_dayOfWeek.IndexOf(elements[0]) + 1).ToString());
      // "23:59:59" becomes "23,59,59".
      result.Add(elements[1].Replace(':', ','));
      // Skip the M reference (elements[2]) and keep only the integer tokens after it.
      // Emitting a number for the M reference as well would duplicate the first value
      // compared with the requested output. A trailing '%' is trimmed so "3%" is kept.
      result.AddRange(elements.Skip(3).Select(e => e.TrimEnd('%')).Where(e => this.IsInt32(e)));

      return String.Join(",", result);
    }

    private Boolean IsInt32(String s)
    {
      Int32 _;
      return Int32.TryParse(s, out _);
    }
  }
}


 