Find all newlines in richtextbox

I am working on a custom text editor control and encountered this problem.

I need a function that gets the character index of every newline ("\n") in the text. I already have two ways to accomplish this:

private List<int> GetNewLineLocations()
    {
        var list = new List<int>();
        int ix = 0;
        foreach (var c in this.Text)
        {
            if (c == '\n') list.Add(ix);
            ix++;
        }
        Debug.WriteLine(ix);
        return list;
    }

And:

private List<int> GetNewLineLocations()
    {
        var list = new List<int>();
        int ix = -1;

        for (int i = 0; i < this.Lines.Length; i++)
        {
            ix += Lines[i].Length;
            ix += 1;
            list.Add(ix);
        }

        return list;
    }

The first solution works, but it slows down as more text is entered into the RichTextBox. The text is around 40,000 characters, but it can be spread across many rows, for example 20,000.

The second one seems like it should be faster because it loops fewer times while doing more or less the same thing, but it slows down dramatically at around 1,000 rows, no matter how much text they contain.

The code of course needs to run fast and not use a lot of resources, which is why I thought the second solution would be better.

My question is:

  1. Which solution is better and why?

  2. Why is the second solution so much slower?

  3. Is there an even better solution?

I tried both of your examples, Felix's, and a solution of my own, using a rich text box with 40k lines. The version below turned out to be the fastest, and I saw no slowdown. Can you try passing the array of lines as a parameter and let us know the result?

public static List<int> GetNewLineLocations(this string[] lines)
        {
            var list = new List<int>();
            int ix = -1;

            for (int i = 0; i < lines.Length; i++)
            {
                ix += lines[i].Length+1;
                list.Add(ix);
            }

            return list;
        }
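
One plausible reason this version shows no slowdown is that the array is read from the control only once. The sketch below illustrates that idea with a hypothetical `FakeTextBox` class (not part of WinForms) whose `Lines` getter re-splits the text on every read and counts the reads. It assumes the real control's `Lines` property behaves in a similar way. `LinesDemo`, `Uncached` and `Cached` are also made-up names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a text box: every read of Lines re-splits the text.
public class FakeTextBox
{
    private readonly string _text;

    // Number of times the Lines array has been rebuilt.
    public int LinesReads { get; private set; }

    public FakeTextBox(string text) { _text = text; }

    public string[] Lines
    {
        get
        {
            LinesReads++;
            return _text.Split('\n');
        }
    }
}

public static class LinesDemo
{
    // Reads box.Lines in the loop condition and in the loop body,
    // like the second version in the question.
    public static List<int> Uncached(FakeTextBox box)
    {
        var list = new List<int>();
        int ix = -1;
        for (int i = 0; i < box.Lines.Length; i++)
        {
            ix += box.Lines[i].Length + 1;
            list.Add(ix);
        }
        return list;
    }

    // Reads box.Lines once and loops over the local array.
    public static List<int> Cached(FakeTextBox box)
    {
        string[] lines = box.Lines;
        var list = new List<int>();
        int ix = -1;
        for (int i = 0; i < lines.Length; i++)
        {
            ix += lines[i].Length + 1;
            list.Add(ix);
        }
        return list;
    }
}
```

With N lines, `Uncached` reads `Lines` 2N + 1 times (N + 1 loop checks plus N element reads), while `Cached` reads it once. Note that in this sketch both methods also add one final entry equal to the text length, which marks the end of the text rather than a newline.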

When working with strings, regular expressions are very nice to use, but they are not the fastest. If you need faster processing, you should do it at a lower level and in parallel. Also make sure to use long as the index type, because int only lets you address up to 2^31 chars, while long goes up to 2^63.
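
For comparison, a regular expression version could look roughly like the following. This is only a minimal sketch: the class and method names are made up for illustration, and it works on a plain string rather than on the control.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexNewlines
{
    // Collects the index of every '\n' match in the given text.
    public static List<int> Find(string text)
    {
        var list = new List<int>();
        foreach (Match m in Regex.Matches(text, "\n"))
        {
            list.Add(m.Index);
        }
        return list;
    }
}
```

It is short and readable, but the regex engine does more work per character than a plain scan, so it is unlikely to be the fastest choice for a single-character search.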

I agree with @Nyerguds, who said in the comments:

The problem is that the standard function to fetch the text in a rich text box is actually a processing function that has to filter out the RTF markup. The actual function to fetch the text is the bottleneck, not what comes after it.

So your data should be held somewhere in the code and not in the user interface. Sooner or later, processing long texts in the control will cause trouble anyway, such as stuttering when scrolling or further bottlenecks. I would also only render the lines that can actually be displayed in the control. So you should rethink your application design and check your frontend/backend separation. Storing your data in a backend lets you access it directly, without depending on your TextBox methods or other user interface code.

Here is a sample of how to easily process data with the Parallel class of the .NET Framework:
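
As a small step in that direction, the text can be taken out of the control once and scanned as a plain string. The following is a minimal sketch under that assumption: `NewlineScanner` is a made-up name, and the caller is expected to pass a snapshot such as `string text = richTextBox.Text;` read a single time.

```csharp
using System.Collections.Generic;

public static class NewlineScanner
{
    // 'text' is a snapshot taken once by the caller, so the control
    // is not asked for its text again while scanning.
    public static List<int> Find(string text)
    {
        var list = new List<int>();
        int ix = text.IndexOf('\n');
        while (ix >= 0)
        {
            list.Add(ix);
            // Continue searching right after the newline just found.
            ix = text.IndexOf('\n', ix + 1);
        }
        return list;
    }
}
```

Unlike the line-based versions, this only records positions where a '\n' actually occurs, so no extra entry is added for the last line.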

    using System;
    using System.Collections.Generic;
    using System.Text;
    using System.Threading.Tasks;

    namespace ConsoleApp1
    {
        internal class Program
        {
            public static byte[] _globalDataStore { get; set; }
            private static void Main(string[] args)
            {
                DoStuff();
                ShowDone();
            }

            private static void ShowDone()
            {
                Console.WriteLine("done...");
                Console.ReadKey();
            }

            private static void DoStuff()
            {
                var tempData = GetData();
                StoreData(ref tempData);
                tempData = null; //free some ram
                var dataIdentifier = (byte)'\n';
                GetAndPromptDataPositions(_globalDataStore, dataIdentifier);
            }

            private static void GetAndPromptDataPositions<T>(T[] data, T dataIdentifier)
            {
                var dataPositionList = GetDataPositions<T>(data, dataIdentifier);
                PromptDataPositions(dataPositionList);
            }

            private static void PromptDataPositions(IEnumerable<long> positionList)
            {
                foreach (var position in positionList)
                {
                    Console.WriteLine($"Position '{position}'");
                }
            }
            private static string GetData()
            {
                return "aasdlj\naksdlkajsdlkasldj\nasld\njkalskdjasldjlasd";
            }

            private static void StoreData(ref string tempData)
            {
                _globalDataStore = Encoding.ASCII.GetBytes(tempData);
            }

            private static List<long> GetDataPositions<T>(T[] data, T dataToFind)
            {
                lock (data) // prevent the data from being changed while processing; only effective if writers on other threads lock on the same array
                {
                    var positionList = new List<long>();
                    Parallel.For(0, data.LongLength, (position) =>
                    {
                        if (data[position].Equals(dataToFind))
                        {
                            lock (positionList) // lock the list because several threads add to it; prevents data corruption
                            {
                                positionList.Add(position);
                            }
                        }
                    });
                    positionList.Sort(); // Parallel.For gives no ordering guarantee, so sort the positions
                    return positionList;
                }
            }
        }
    }
