
How to obtain certain lines from a text file in C#?

I'm working in C# and have a large text file (75 MB). I want to save the lines that match a regular expression.

I tried reading the file with a StreamReader and ReadToEnd, but that takes 400 MB of RAM, and calling it a second time throws an out-of-memory exception.

I then tried using File.ReadAllLines():

string[] lines = File.ReadAllLines("file");
StringBuilder specialLines = new StringBuilder();

foreach (string line in lines)
{
    // regex is the Regex for the expression being matched (pattern not shown)
    if (regex.IsMatch(line))
        specialLines.Append(line);
}

This works, but when my function returns, the memory is not released and I'm left with 300 MB in use. Only when I call the function again and execute the line string[] lines = File.ReadAllLines("file"); do I see the memory drop to about 50 MB and then climb back to 200 MB.

How can I free this memory, or get the lines I need in a different way?

        // using blocks close the stream and reader even if an exception is thrown
        using (var file = File.OpenRead("myfile.txt"))
        using (var reader = new StreamReader(file))
        {
            while (!reader.EndOfStream)
            {
                string line = reader.ReadLine();
                // evaluate the line here.
            }
        }
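
To make the "evaluate the line here" step concrete, here is a minimal sketch that tests each line against a regular expression and writes the matches straight to an output file, so neither the input nor the matches are held in memory. The class name `MatchSaver`, the method `SaveMatchingLines`, and its parameters are hypothetical and not part of the original answer.

```csharp
using System.IO;
using System.Text.RegularExpressions;

static class MatchSaver
{
    // Hypothetical helper: copies every line of inputPath that matches
    // pattern into outputPath and returns the number of lines written.
    public static int SaveMatchingLines(string inputPath, string outputPath, string pattern)
    {
        var regex = new Regex(pattern, RegexOptions.Compiled);
        int count = 0;

        using (var reader = new StreamReader(inputPath))
        using (var writer = new StreamWriter(outputPath))
        {
            string line;
            // ReadLine returns null at end of file; only one line is held at a time.
            while ((line = reader.ReadLine()) != null)
            {
                if (regex.IsMatch(line))
                {
                    writer.WriteLine(line);
                    count++;
                }
            }
        }

        return count;
    }
}
```

If the matches are needed in memory afterwards, the `writer.WriteLine` call could be swapped for appending to a StringBuilder, as in the question.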

You need to stream the text instead of loading the whole file into memory. Here's a way to do it, using an extension method and LINQ:

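using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

static class ExtensionMethods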
{
    public static IEnumerable<string> EnumerateLines(this TextReader reader)
    {
        string line;
        while((line = reader.ReadLine()) != null)
        {
            yield return line;
        }
    }
}

...

var regex = new Regex(..., RegexOptions.Compiled);
using (var reader = new StreamReader(fileName))
{
    var specialLines =
        reader.EnumerateLines()
              .Where(line => regex.IsMatch(line))
              .Aggregate(new StringBuilder(),
                         (sb, line) => sb.AppendLine(line));
}

You can use StreamReader#ReadLine to read the file line by line and save only the lines you need.

You should use the enumerator pattern to keep the memory footprint low in case your file is large.
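
As one possible illustration of the enumerator approach, the sketch below relies on `File.ReadLines` (available since .NET Framework 4), which returns a lazily evaluated `IEnumerable<string>`, unlike `File.ReadAllLines`, which builds the whole array first. The class `LineFilter` and method `MatchingLines` are hypothetical names introduced only for this example.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

static class LineFilter
{
    // Hypothetical helper: lazily yields the lines of path that match pattern.
    // Nothing is read from disk until the caller starts enumerating.
    public static IEnumerable<string> MatchingLines(string path, string pattern)
    {
        var regex = new Regex(pattern, RegexOptions.Compiled);

        // File.ReadLines streams the file one line at a time.
        return File.ReadLines(path).Where(line => regex.IsMatch(line));
    }
}
```

A caller would typically consume the result in a `foreach` loop or pass it to `File.WriteAllLines`, so that only the current line is in memory at any moment.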


 