
Is this the most efficient way to search for a substring?

I'm working with some code that returns a code indicating the type of user (e.g. "A", "B", "C", "D", etc.). Each code corresponds to a specific role and/or scope (e.g. across the entire application or just for the object being worked on).

In some code that I'm looking at, I see calls to check if the user's code is one of several in order to allow them to perform some action. So I see calls like:

//"B" would come from the database
string userCode = "B";

//some more work...

//if the user's code is either A or C...
if("AC".IndexOf(userCode) >= 0) {
  //do work that allows the user to progress
} else {
  //notify user they can't do this operation
}

Is this an efficient way of performing this check? Are there more efficient ways?

Thanks in advance!

Looking at the decompiled code for Contains(), it just calls IndexOf() with StringComparison.Ordinal, so I'd say IndexOf() is most efficient (by a very small hair) if used in the same way (Ordinal), since it has one less method call. But Contains() is more readable and therefore more maintainable:

public bool Contains(string value)
{
    return (this.IndexOf(value, StringComparison.Ordinal) >= 0);
}

As in all things, I'd go with what's more readable and maintainable rather than splitting hairs on performance. Only micro-optimize when you know this spot is a bottleneck.

UPDATE : Over 1,000,000 iterations:

  • Contains(value) - took 130 ms
  • IndexOf(value, StringComparison.Ordinal) - took 128 ms

So as you can see, they are very nearly the same. Once again, go with what's more maintainable.

UPDATE 2 : If your code is always a single char (not a 1-char string), IndexOf() is faster:

  • Contains(char value) - took 94 ms
  • IndexOf(char value) - took 16 ms

If you know your codes are always a single char, IndexOf() with a char argument was roughly six times faster in this run.

This is because Contains(char value) is an extension method on IEnumerable<T> and not a first-class method of string.

But once again ~100 ms over 1,000,000 iterations is really, truly, quite negligible.

If you're looking for a single character and you don't need case-insensitive matching, use the overload that takes a char. IndexOf(char) does an ordinal (case-sensitive) search, which is quicker than searching for a substring.

"AC".IndexOf('C');

This would have to be ridiculously performance critical to matter though. What you are doing would be extremely fast with any obvious method.
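As a rough sketch of the char-overload idea, the check could be wrapped in a small helper. The names UserCodeCheck and IsAllowed are made up for illustration and are not from the original code; the helper assumes every valid code is exactly one character long.

```csharp
using System;

public static class UserCodeCheck
{
    // Hypothetical helper (not from the original code): returns true when
    // userCode is exactly one character and that character appears in
    // allowedCodes.
    public static bool IsAllowed(string userCode, string allowedCodes)
    {
        // Reject null, empty, and multi-character codes up front so that
        // indexing userCode[0] is always safe.
        if (userCode == null || userCode.Length != 1)
        {
            return false;
        }

        // IndexOf(char) performs an ordinal, case-sensitive search.
        return allowedCodes.IndexOf(userCode[0]) >= 0;
    }
}
```

A call such as UserCodeCheck.IsAllowed(userCode, "AC") would then replace the "AC".IndexOf(userCode) >= 0 test from the question.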

Update - Timings

[Test]
public void Time()
{
    const string UserCode = "C";
    const char UserCodeChar = 'C';
    const int Iterations = 10000000;

    double adjust = 0;

    Func<Action, double> time = action =>
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++) action();
        return sw.Elapsed.TotalMilliseconds;
    };

    Action<string, Action> test = (desc, t) =>
    {
        double ms = time(t) - adjust;
        Console.WriteLine(desc + " time: {0}ms", ms);
    };

    adjust = time(() => { });

    test("IndexOfString", () => "AC".IndexOf(UserCode));
    test("IndexOfString", () => "AC".IndexOf(UserCode));

    test("ContainsString", () => "AC".Contains(UserCode));
    test("ContainsString", () => "AC".Contains(UserCode));

    test("IndexOfChar", () => "AC".IndexOf(UserCodeChar));
    test("IndexOfChar", () => "AC".IndexOf(UserCodeChar));
}

Result:

IndexOfString time: 1035.2984ms
IndexOfString time: 1026.2889ms
ContainsString time: 764.9274ms
ContainsString time: 736.7621ms
IndexOfChar time: 92.9008ms
IndexOfChar time: 92.9961ms

I think you can assume that the system libraries are implemented quite efficiently and that you usually won't be able to speed things up with homemade solutions. That said, I think your way of encoding user types is quite strange. Why not use bitmasks or something like that? Besides that, I would say the question is largely irrelevant: compared to accessing a database and doing "some work", the cost of this check does not matter at all.

Using the Contains() function is one option. I don't know how it performs versus IndexOf(), but it's an option:

string userCode = "B";
string someStringToSearchIn = "Current user is: B";

if (someStringToSearchIn.Contains(userCode))
{
    //do something
}

Firstly, I'm not really sure why you're concerned about efficiency here. You are performing an extremely short string search, which is unlikely to bog down performance.

That being said, I actually find the style confusing. If you want to know if userCode is one of "A" and "C", just say so:

if (userCode.Equals("A") || userCode.Equals("C"))
{
    // Do something useful
}

Crystal clear, and most likely performant as well.
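There is also a correctness angle to spelling the comparison out. A substring search treats the empty string and the combined string "AC" as matches, which an explicit comparison does not. The sketch below contrasts the two; RoleCheck, SubstringCheck, ExplicitCheck and the Allowed set are illustrative names, not part of the original code.

```csharp
using System;
using System.Collections.Generic;

public static class RoleCheck
{
    // Hypothetical set of codes that are allowed to proceed.
    private static readonly HashSet<string> Allowed =
        new HashSet<string>(StringComparer.Ordinal) { "A", "C" };

    // The style from the question: a substring search inside "AC".
    // Note that "" and "AC" are both found at index 0.
    public static bool SubstringCheck(string userCode)
    {
        return "AC".IndexOf(userCode, StringComparison.Ordinal) >= 0;
    }

    // Explicit membership test: only the exact strings "A" and "C" pass.
    public static bool ExplicitCheck(string userCode)
    {
        return userCode != null && Allowed.Contains(userCode);
    }
}
```

With these helpers, SubstringCheck("") and SubstringCheck("AC") both return true, while ExplicitCheck returns false for the same inputs. For two codes, the Equals version above is just as good; the set only starts to pay off as the list of allowed codes grows.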

As a side note, if you can get these codes into an enum instead, things could be much easier:

[Flags]
public enum UserCode
{
    None = 0,
    A = 1,
    B = 2,
    C = 4
}

Now you can just say:

if ((userCode & (UserCode.A | UserCode.C)) != UserCode.None)
{
    // Do something even more useful
}
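Since the codes arrive from the database as strings, they would need to be mapped onto the enum first. The following is a minimal sketch of one way to do that, assuming the three codes shown above; UserCodeParser, Parse and CanProceed are hypothetical names, and unknown codes are mapped to None as an arbitrary choice.

```csharp
using System;

[Flags]
public enum UserCode
{
    None = 0,
    A = 1,
    B = 2,
    C = 4
}

public static class UserCodeParser
{
    // Hypothetical mapping from the database string to a flag value.
    // Anything unrecognized (including null) becomes UserCode.None.
    public static UserCode Parse(string raw)
    {
        switch (raw)
        {
            case "A": return UserCode.A;
            case "B": return UserCode.B;
            case "C": return UserCode.C;
            default: return UserCode.None;
        }
    }

    // True when the code is A or C, using the same bitmask test as above.
    public static bool CanProceed(UserCode code)
    {
        return (code & (UserCode.A | UserCode.C)) != UserCode.None;
    }
}
```

An explicit switch is used here instead of Enum.TryParse because TryParse also accepts numeric strings such as "3", which may not be what you want for values coming from a database.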

At least on my computer, Contains with a string (rather than a char) is fastest.

Results:
IndexOf time: 616ms
Contains time: 499ms
Contains char time: 707ms

Code:

    static void Main(string[] args)
    {
        string userCode = "B";
        char userCodeChar = 'B';
        int iterations = 10000000;
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            if ("AC".IndexOf(userCode) >= 0)
            {
                int a = 1 + 1;
            }
        }
        sw.Stop();
        Console.WriteLine("IndexOf time: {0}ms", sw.ElapsedMilliseconds);


        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            if ("AC".Contains(userCode))
            {
                int a = 1 + 1;
            }
        }
        sw.Stop();
        Console.WriteLine("Contains time: {0}ms", sw.ElapsedMilliseconds);



        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            if ("AC".Contains(userCodeChar))
            {
                int a = 1 + 1;
            }
        }
        sw.Stop();
        Console.WriteLine("Contains char time: {0}ms", sw.ElapsedMilliseconds);

        Console.WriteLine("Done");
        Console.ReadLine();
    }
