
Cleaning up and extracting data from text files

I need to extract data from non-delimited text files using C#. Basically, I need to remove all unwanted characters, then mark the end of each line and add a line break. Once the data has been separated into individual lines, I need to loop through each line in turn and extract values using regular expressions. I have been doing this with Perl but now need to do it in C#. The raw file contains numerous line break characters throughout the file, not just at the end of a line as you would expect. I will be able to extract values using Regex objects, but I am having trouble getting the file into a format that has each record on a line of its own.

You provided very little information, but this code will give you a List of lines.

Note that ReadLine treats a line as a sequence of characters followed by a line feed ("\n"), a carriage return ("\r"), or a carriage return immediately followed by a line feed ("\r\n").
I am not sure whether this is the behaviour you expect.

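    using System;
    using System.Collections.Generic;
    using System.IO;

    string fileName = "Text.txt";
    List<string> lines = new List<string>();

    // Read the file one line at a time; ReadLine returns null at end of file.
    using (StreamReader r = new StreamReader(fileName))
    {
        string line;
        while ((line = r.ReadLine()) != null)
        {
            lines.Add(line);
        }
    }

    foreach (string s in lines)
    {
        Console.WriteLine(s);
        // Apply your Regex to s here.
    }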
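
Because the raw file has line breaks scattered inside records, reading it with ReadLine may split a single record across several list entries. One possible way around that, sketched below, is to read the whole file as one string, strip every line break, and then split on whatever marks the end of a record. The question does not say what that marker is, so the `recordTerminator` parameter, the `RecordSplitter` class and the `SplitRecords` method are all hypothetical names introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RecordSplitter
{
    // Removes every CR/LF from the raw text, then splits on the end-of-record
    // marker. ASSUMPTION: records end with a single known character; the
    // original question does not say what the real marker is.
    public static List<string> SplitRecords(string raw, char recordTerminator)
    {
        // Line breaks can appear anywhere in the raw file, so drop them all first.
        string flattened = Regex.Replace(raw, @"[\r\n]+", "");

        var records = new List<string>();
        foreach (string part in flattened.Split(recordTerminator))
        {
            string record = part.Trim();
            if (record.Length > 0)
            {
                records.Add(record); // skip empty pieces, e.g. after the last terminator
            }
        }
        return records;
    }
}
```

To use it, you could load the file with `File.ReadAllText("Text.txt")` (from System.IO), pass the result to `RecordSplitter.SplitRecords` together with the real terminator character, and run your Regex over each returned record. If the end of a record is a pattern rather than a single character, `Regex.Split` would be the closer fit.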
