
In C#, Loading large file into winform richtextbox

I need to load a text file in the 10 MB range into a WinForms RichTextBox, but my current code freezes the UI. I tried having a BackgroundWorker do the loading, but that doesn't seem to work well either.

Here are the loading approaches I have tried. Is there any way to improve the performance? Thanks.

    private BackgroundWorker bw1;
    private string[] lines;
    Action showMethod;
    private void button1_Click(object sender, EventArgs e)
    {
        bw1 = new BackgroundWorker();
        bw1.DoWork += new DoWorkEventHandler(bw_DoWork);
        bw1.RunWorkerCompleted += bw_RunWorkerCompleted;
        string path = @"F:\DXHyperlink\Book.txt";
        if (File.Exists(path))
        {
            string readText = File.ReadAllText(path);
            lines = readText.Split(new string[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries);
            bw1.RunWorkerAsync();

        }
    }

    private void bw_DoWork(object sender, DoWorkEventArgs e)
    {
        Invoke((ThreadStart)delegate()
        {
            for (int i = 0; i < lines.Length; i++)
            {
                richEditControl1.Text += lines[i] + "\n";
            }
        });
    }

I also tried:

    Action showMethod = delegate()
    {
        for (int i = 0; i < lines.Length; i++)
        {
            richEditControl1.Text += lines[i] + "\n";
        }
    };

The problem is how you're invoking the UI update. See the AppendText method below.

private BackgroundWorker bw1;

private void button1_Click(object sender, EventArgs e)
{
    bw1 = new BackgroundWorker();
    bw1.DoWork += new DoWorkEventHandler(bw_DoWork);
    bw1.RunWorkerCompleted += bw_RunWorkerCompleted;
    bw1.RunWorkerAsync();
}

private void bw_DoWork(object sender, DoWorkEventArgs e)
{
    string path = @"F:\DXHyperlink\Book.txt";
    if (File.Exists(path))
    {
        string readText = File.ReadAllText(path);
        foreach (string line in readText.Split(new string[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries))
        {
            AppendText(line);
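            // Presumably a demo-only delay that makes the incremental updates visible;
            // with many lines it dominates the total load time.
            Thread.Sleep(500);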
        }
    }
}

private void AppendText(string line)
{
    if (richTextBox1.InvokeRequired)
    {
        richTextBox1.Invoke((ThreadStart)(() => AppendText(line)));
    }
    else
    {
        richTextBox1.AppendText(line + Environment.NewLine);
    }
}

In addition, reading the whole file at once is very inefficient. I would rather read it chunk by chunk and update the UI as I go, e.g.:

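private void bw_DoWork(object sender, DoWorkEventArgs e)
{
    string path = @"F:\DXHyperlink\Book.txt";
    const int chunkSize = 1024;
    using (var file = File.OpenRead(path))
    {
        var buffer = new byte[chunkSize];
        int bytesRead;
        while ((bytesRead = file.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Decode only the bytes actually read; the last chunk is usually shorter than the buffer.
            // Note: a multi-byte UTF-8 character that straddles a chunk boundary is not decoded correctly here.
            string stringData = System.Text.Encoding.UTF8.GetString(buffer, 0, bytesRead);

            // Note: AppendText (above) adds a line break after every call, so each chunk ends with one.
            AppendText(string.Join(Environment.NewLine, stringData.Split(new[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries)));
        }
    }
}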
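Decoding each chunk independently can corrupt a multi-byte UTF-8 character that happens to be split across two reads. One way around this, sketched below, is a stateful System.Text.Decoder that carries the incomplete bytes over to the next call. The class and method names (ChunkedTextReader, ReadChunks) are made up for illustration, and UTF-8 is assumed as the file encoding.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class ChunkedTextReader
{
    // Hypothetical helper: yields the decoded text of a stream one chunk at a time.
    public static IEnumerable<string> ReadChunks(Stream input, int chunkSize)
    {
        // A Decoder keeps state between calls, unlike Encoding.GetString.
        Decoder decoder = Encoding.UTF8.GetDecoder();
        var bytes = new byte[chunkSize];
        var chars = new char[Encoding.UTF8.GetMaxCharCount(chunkSize)];

        int bytesRead;
        while ((bytesRead = input.Read(bytes, 0, bytes.Length)) > 0)
        {
            // An incomplete trailing byte sequence is buffered until the next call.
            int charCount = decoder.GetChars(bytes, 0, bytesRead, chars, 0);
            if (charCount > 0)
            {
                yield return new string(chars, 0, charCount);
            }
        }
    }
}
```

Each yielded string could then be passed to the UI thread in place of stringData above. A StreamReader reading into a char[] buffer handles split characters as well and may be the simpler choice.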

You don't want to concatenate strings in a loop.

A System.String object is immutable. When two strings are concatenated, a new String object is created. Iterative string concatenation creates multiple strings that are un-referenced and must be garbage collected. For better performance, use the System.Text.StringBuilder class.

The following code is very inefficient:

for (int i = 0; i < lines.Length; i++)
{
    richEditControl1.Text += lines[i] + "\n";
}

Try instead:

private void bw_DoWork(object sender, DoWorkEventArgs e)
{
    // CPU-intensive work happens on the background thread.
    // The result gets its own name so it does not clash with the `lines` field.
    var text = string.Join("\r\n", lines);

    // The following code is invoked on the UI thread and only assigns the result,
    // so the UI is not blocked for long.
    Invoke((ThreadStart)delegate()
    {
        richEditControl1.Text = text;
    });
}
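On newer framework versions the same idea (do the heavy work off the UI thread, assign the result once) can probably be written with async/await rather than BackgroundWorker and Invoke. This is only a sketch: TextLoader and LoadTextAsync are hypothetical names, and the commented handler reuses the control name and path from the question.

```csharp
using System.IO;
using System.Threading.Tasks;

static class TextLoader
{
    // Hypothetical helper: reads the whole file on a thread-pool thread.
    public static Task<string> LoadTextAsync(string path)
    {
        return Task.Run(() => File.ReadAllText(path));
    }
}

// Possible use inside the form (names follow the question):
//
// private async void button1_Click(object sender, EventArgs e)
// {
//     // In a WinForms event handler, execution normally resumes on the UI thread after the await.
//     richEditControl1.Text = await TextLoader.LoadTextAsync(@"F:\DXHyperlink\Book.txt");
// }
```

The single assignment to Text still runs on the UI thread, so the time the control itself needs to lay out the text is not avoided.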

Why do you want to split the lines and join them again?

Strings are immutable, which means they can't be changed. So every time you do Text += "..." a new string has to be created and assigned to Text. For a 10 MB string that is far from ideal, and for huge strings it can take a very long time to complete.

See "What is the difference between a mutable and immutable string in C#?"

If you really want to split the lines and join them again, then StringBuilder is the right option for you.

StringBuilder strb = new StringBuilder();

for (int i = 0; i < lines.Length; i++)
{
    // Chained Append calls avoid creating a temporary string per line.
    strb.Append(lines[i]).Append("\n");
}

richEditControl1.Text = strb.ToString();

See "String vs. StringBuilder".

Conceptually, a StringBuilder is a growable buffer of characters. It is also mutable, meaning its contents can be changed in place.

Inside the loop you can do any extra processing on each string and add the final result to the StringBuilder. Once the loop is finished, the StringBuilder holds the complete text; convert it to a string and assign it to Text.
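If the lines are to be shown incrementally from a background thread, the same StringBuilder idea can be used to group them into a few large blocks, so that the UI thread gets a handful of updates instead of one per line. A minimal sketch, with LineBatcher, ToBlocks and the block size as illustrative assumptions:

```csharp
using System.Collections.Generic;
using System.Text;

static class LineBatcher
{
    // Hypothetical helper: joins lines into blocks of at most linesPerBlock lines each.
    public static IEnumerable<string> ToBlocks(IEnumerable<string> lines, int linesPerBlock)
    {
        var sb = new StringBuilder();
        int count = 0;

        foreach (string line in lines)
        {
            sb.Append(line).Append('\n');
            if (++count == linesPerBlock)
            {
                yield return sb.ToString();
                sb.Clear();
                count = 0;
            }
        }

        // Emit whatever is left over as a final, shorter block.
        if (count > 0)
        {
            yield return sb.ToString();
        }
    }
}
```

Each block could then be handed to the control with one AppendText call on the UI thread. A suitable block size would have to be found by measuring.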

It took me a while to nail this one..

Test one & two:

First I created some clean data:

string l10 = " 123456789";
string l100 = l10 + l10 + l10 + l10 + l10 + l10 + l10 + l10 +l10 + l10;
string big = "";
StringBuilder sb = new StringBuilder(10001000);
for (int i = 1; i <= 100000; i++)
    // this takes 3 seconds to load
    sb.AppendLine(i.ToString("Line 000,000,000 ") + l100 + " www-stackexchange-com "); 
     // this takes 45 seconds to load !!
    //sb.AppendLine(i.ToString("Line 000,000,000 ") + l100 + " www.stackexchange.com "); 
big = sb.ToString();

Console.WriteLine("\r\nStringLength: " + big.Length.ToString("###,###,##0") + "  ");

richTextBox1.WordWrap = false;
richTextBox1.Font = new System.Drawing.Font("Consolas", 8f);
richTextBox1.AppendText(big);
Console.WriteLine(richTextBox1.Text.Length.ToString("###,###,##0") + " chars in RTB");
Console.WriteLine(richTextBox1.Lines.Length.ToString("###,###,##0") + " lines in RTB ");

Displaying 100k lines totalling around 14 MB takes either 2-3 seconds or 45-50 seconds, depending on whether each line ends in a link.

Cranking the line count up to 500k brings the load time for plain text to around 15-20 seconds, and to several minutes for the version that includes a (valid) link at the end of each line.

When I go to 1M lines the loading crashes VS.

Conclusions :

  • Loading text with links in it takes 10+ times longer, and the UI freezes during that time.

  • Loading 10-15MB of textual data is no real problem as such.

Test three:

string bigFile = File.ReadAllText("D:\\AllDVDFiles.txt");
richTextBox1.AppendText(bigFile);

(This was actually the start of my investigation.) This tries to load an 8 MB file containing directory and file info from a large number of data DVDs. And it freezes, too.

As we have seen, the file size is not the reason. Nor are there any links embedded.

At first glance, the cause seems to be unusual characters in some of the filenames. After saving the file as UTF-8 and changing the read command to:

string bigFile = File.ReadAllText("D:\\AllDVDFiles.txt", Encoding.UTF8);

the file loads just fine in 1-2 seconds, as expected.

Final conclusions:

  • You need to watch out for a wrong encoding, as the resulting characters can freeze the RTB during loading.
  • When you add links, you must expect loading to take a lot longer (10-20x) than for pure text. I tried to trick the RTB by preparing an RTF string, but it didn't help. It seems that analyzing and storing all those links will always take a long time.
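To catch an encoding mismatch before the text ever reaches the control, one option is to decode the raw bytes with a strict UTF-8 decoder that throws on invalid input. The sketch below only checks that the bytes are valid UTF-8; it does not explain why the RTB freezes on badly decoded text. EncodingCheck and TryDecodeUtf8 are hypothetical names.

```csharp
using System.Text;

static class EncodingCheck
{
    // Hypothetical helper: returns false if the bytes are not valid UTF-8.
    public static bool TryDecodeUtf8(byte[] data, out string text)
    {
        // throwOnInvalidBytes: true makes the decoder throw rather than
        // silently substituting replacement characters.
        var strictUtf8 = new UTF8Encoding(
            encoderShouldEmitUTF8Identifier: false,
            throwOnInvalidBytes: true);
        try
        {
            text = strictUtf8.GetString(data);
            return true;
        }
        catch (DecoderFallbackException)
        {
            text = null;
            return false;
        }
    }
}
```

The input could come from File.ReadAllBytes. If the check fails, the file is probably stored in some other encoding and would need to be converted or read with that encoding instead. Note that a leading byte order mark, if present, stays in the decoded string.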

So: If you really need a link on every line, do partition the data into smaller parts and give the user an interface to scroll & search through those parts.

Of course, appending all those lines one by one will always be far too slow, but this has already been mentioned in the comments and other answers.
