简体   繁体   中英

is String.Contains() faster than walking through whole array of char in string?

I have a function that is walking through the string looking for pattern and changing parts of it. I could optimize it by inserting

if (!text.Contains(pattern)) return;

But, I am actually walking through the whole string and comparing parts of it with the pattern, so the question is, how String.Contains() actually works? I know there was such a question - How does String.Contains work? but answer is rather unclear. So, if String.Contains() walks through the whole array of chars as well and compare them to pattern I am looking for as well, it wouldn't really make my function faster, but slower.

So, is it a good idea to attempt such an optimizations? And - is it possible for String.Contains() to be even faster than function that just walk through the whole array and compare every single character with some constant one?

Here is the code:

    public static char colorchar = (char)3;

    public static Client.RichTBox.ContentText color(string text, Client.RichTBox SBAB)
    {
        if (text.Contains(colorchar.ToString()))
        {
            int color = 0;
            bool closed = false;
            int position = 0;
            while (text.Length > position)
            {
                if (text[position] == colorchar)
                {
                    if (closed)
                    {
                        text = text.Substring(position, text.Length - position);
                        Client.RichTBox.ContentText Link = new Client.RichTBox.ContentText(ProtocolIrc.decode_text(text), SBAB, Configuration.CurrentSkin.mrcl[color]);
                        return Link;
                    }

                    if (!closed)
                    {
                        if (!int.TryParse(text[position + 1].ToString() + text[position + 2].ToString(), out color))
                        {
                            if (!int.TryParse(text[position + 1].ToString(), out color))
                            {
                                color = 0;
                            }
                        }
                        if (color > 9)
                        {
                            text = text.Remove(position, 3);
                        }
                        else
                        {
                            text = text.Remove(position, 2);
                        }
                        closed = true;
                        if (color < 16)
                        {
                            text = text.Substring(position);
                            break;
                        }
                    }
                }
                position++;
            }
        }
        return null;
    }

Short answer is that your optimization is no optimization at all.
Basically, String.Contains(...) just returns String.IndexOf(..) >= 0
You could improve your alogrithm to:

int position = text.IndexOf(colorchar.ToString()...);
if (-1 < position)
{  /* Do it */ }

Yes.

And doesn't have a bug (ahhm...).

There are better ways of looking for multiple substrings in very long texts, but for most common usages String.Contains (or IndexOf) is the best.

Also IIRC the source of String.Contains is available in the .Net shared sources

Oh, and if you want a performance comparison you can just measure for your exact use-case

Check this similar post How does string.contains work

I think that you will not be able to simply do anything faster than String.Contains, unless you want to use standard CRT function wcsstr, available in msvcrt.dll, which is not so easy

Unless you have profiled your application and determined that the line with String.Contains is a bottle-neck, you should not do any such premature optimizations. It is way more important to keep your code's intention clear.

Ans while there are many ways to implement the methods in the .NET base classes, you should assume the default implementations are optimal enough for most people's use cases. For example, any (future) implementation of .NET might use the x86-specific instructions for string comparisons. That would then always be faster than what you can do in C#.

If you really want to be sure whether your custom string comparison code is faster than String.Contains , you need to measure them both using many iterations, each with a different string. For example using the Stopwatch class to measure the time.

如果你现在可以用于优化的细节(不仅仅是简单的包含检查)确定你的方法比string.Contains更快,否则 - 不是。

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM