Get Index of First non-Whitespace Character in C# String

Question

Is there a means to get the index of the first non-whitespace character in a string (or more generally, the index of the first character matching a condition) in C# without writing my own looping code?

EDIT

By "writing my own looping code", I really meant that I'm looking for a compact expression that solves the problem without cluttering the logic I'm working on.

I apologize for any confusion on that point.

Answer 1

A string is of course an IEnumerable<char>, so you can use LINQ:

int offset = someString.TakeWhile(c => char.IsWhiteSpace(c)).Count();
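
One caveat with this expression: when the string is empty or contains only whitespace, TakeWhile consumes every character, so the result equals someString.Length rather than -1. A minimal sketch of a wrapper that maps that case to -1, assuming that is the convention you want (the class and method names are hypothetical, not part of the answer):

```csharp
using System;
using System.Linq;

public static class WhitespaceOffset
{
    // Hypothetical helper: index of the first non-whitespace character,
    // or -1 when the string is empty or all whitespace.
    public static int FirstNonWhitespace(string s)
    {
        // Count the leading whitespace characters.
        int offset = s.TakeWhile(c => char.IsWhiteSpace(c)).Count();

        // If every character was consumed, there is nothing to point at.
        return offset == s.Length ? -1 : offset;
    }
}
```

For example, WhitespaceOffset.FirstNonWhitespace("   Hello") gives 3, while an all-whitespace input gives -1.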

Answer 2

I like to define my own extension method for returning the index of the first element that satisfies a custom predicate in a sequence.

/// <summary>
/// Returns the index of the first element in the sequence 
/// that satisfies a condition.
/// </summary>
/// <typeparam name="TSource">
/// The type of the elements of <paramref name="source"/>.
/// </typeparam>
/// <param name="source">
/// An <see cref="IEnumerable{T}"/> that contains
/// the elements to apply the predicate to.
/// </param>
/// <param name="predicate">
/// A function to test each element for a condition.
/// </param>
/// <returns>
/// The zero-based index position of the first element of <paramref name="source"/>
/// for which <paramref name="predicate"/> returns <see langword="true"/>;
/// or -1 if <paramref name="source"/> is empty
/// or no element satisfies the condition.
/// </returns>
public static int IndexOf<TSource>(this IEnumerable<TSource> source, 
    Func<TSource, bool> predicate)
{
    int i = 0;

    foreach (TSource element in source)
    {
        if (predicate(element))
            return i;

        i++;
    }

    return -1;
}

You could then use this extension method to address your original problem:

string str = "   Hello World";
int i = str.IndexOf<char>(c => !char.IsWhiteSpace(c));

Answer 3

string s= "   \t  Test";
Array.FindIndex(s.ToCharArray(), x => !char.IsWhiteSpace(x));

returns 6

To add a condition just do ...

Array.FindIndex(s.ToCharArray(), x => !char.IsWhiteSpace(x) && your condition);

Answer 4

You can use the String.IndexOfAny function, which returns the index of the first occurrence of any character in a specified array of Unicode characters.

Alternatively, you can use the String.TrimStart function, which removes all white-space characters from the beginning of the string. The index of the first non-white-space character is the difference between the length of the original string and the length of the trimmed one.

You can even pick a set of characters to trim :)

Basically, if you are looking for a limited set of chars (let's say digits) you should go with the first method.

If you are trying to ignore a limited set of characters (like white spaces) you should go with the second method.

A last option would be to use the LINQ methods. The following shows both the TrimStart and the LINQ approaches:

string s = "        qsdmlkqmlsdkm";
Console.WriteLine(s.TrimStart());
Console.WriteLine(s.Length - s.TrimStart().Length);
Console.WriteLine(s.FirstOrDefault(c => !Char.IsWhiteSpace(c)));
Console.WriteLine(s.IndexOf(s.FirstOrDefault(c => !Char.IsWhiteSpace(c))));

Output:

qsdmlkqmlsdkm
8
q
8
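
The snippet above does not show the String.IndexOfAny option. A minimal sketch of it for the "limited set of chars" case, assuming the set is the ASCII digits (the class and method names are hypothetical):

```csharp
using System;

public static class IndexOfAnyExample
{
    // The limited set of characters we are looking for (ASCII digits here).
    private static readonly char[] Digits = "0123456789".ToCharArray();

    // Index of the first ASCII digit in s, or -1 if s contains none.
    public static int FirstDigit(string s) => s.IndexOfAny(Digits);
}
```

IndexOfAnyExample.FirstDigit("abc42") returns 3, and a string without digits returns -1.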

Answer 5

var match = Regex.Match(" \t test  ", @"\S"); // \S means all characters that are not whitespace
if (match.Success)
{
    int index = match.Index;
    //do something with index
}
else
{
    //there were no non-whitespace characters, handle appropriately
}

If you'll be doing this often, you should cache the Regex instance for this pattern for performance reasons, e.g.:

static readonly Regex nonWhitespace = new Regex(@"\S");

Then use it like:

nonWhitespace.Match(" \t test  ");
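
Note that new Regex(@"\S") on its own builds an interpreted regex; passing RegexOptions.Compiled is what asks .NET to compile the pattern. A minimal sketch combining the cached instance with the Success check from above, assuming -1 is an acceptable "not found" value (the class and method names are hypothetical):

```csharp
using System.Text.RegularExpressions;

public static class NonWhitespaceRegex
{
    // Cached once; RegexOptions.Compiled trades startup cost for faster matching.
    private static readonly Regex NonWhitespace =
        new Regex(@"\S", RegexOptions.Compiled);

    // Index of the first non-whitespace character, or -1 if there is none.
    public static int FirstIndex(string input)
    {
        Match match = NonWhitespace.Match(input);
        return match.Success ? match.Index : -1;
    }
}
```

Whether the compiled option pays off depends on how often the pattern is used, so it is worth measuring in your own scenario.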

Answer 6

Since there were several solutions here I decided to do some performance tests to see how each performs. Decided to share these results for those interested...

    int iterations = 1000000;
    int result = 0;
    string s= "   \t  Test";

    System.Diagnostics.Stopwatch watch = new Stopwatch();

    // Convert to char array and use FindIndex
    watch.Start();
    for (int i = 0; i < iterations; i++)
        result = Array.FindIndex(s.ToCharArray(), x => !char.IsWhiteSpace(x)); 
    watch.Stop();
    Console.WriteLine("Convert to char array and use FindIndex: " + watch.ElapsedMilliseconds);

    // Trim spaces and get index of first character
    watch.Restart();
    for (int i = 0; i < iterations; i++)
        result = s.IndexOf(s.TrimStart().Substring(0,1));
    watch.Stop();
    Console.WriteLine("Trim spaces and get index of first character: " + watch.ElapsedMilliseconds);

    // Use extension method
    watch.Restart();
    for (int i = 0; i < iterations; i++)
        result = s.IndexOf<char>(c => !char.IsWhiteSpace(c));
    watch.Stop();
    Console.WriteLine("Use extension method: " + watch.ElapsedMilliseconds);

    // Loop
    watch.Restart();
    for (int i = 0; i < iterations; i++)
    {   
        result = 0;
        foreach (char c in s)
        {
            if (!char.IsWhiteSpace(c))
                break;
            result++;
        }
    }
    watch.Stop();
    Console.WriteLine("Loop: " + watch.ElapsedMilliseconds);

Results are in milliseconds....

Where s = "   \t  Test"
Convert to char array and use FindIndex: 154
Trim spaces and get index of first character: 189
Use extension method: 234
Loop: 146

Where s = "Test"
Convert to char array and use FindIndex: 39
Trim spaces and get index of first character: 155
Use extension method: 57
Loop: 15

Where s = (1000 character string with no spaces)
Convert to char array and use FindIndex: 506
Trim spaces and get index of first character: 534
Use extension method: 51
Loop: 15

Where s = (1000 character string that starts with "   \t  Test")
Convert to char array and use FindIndex: 609
Trim spaces and get index of first character: 1103
Use extension method: 226
Loop: 146

Draw your own conclusions, but mine is to use whichever one you like best, because the performance differences are insignificant in real-world scenarios.

Answer 7

Inspired by this solution of trimming the string, but much more efficient by using ReadOnlySpan:

string s = "   xyz";
int index = s.Length - s.AsSpan().TrimStart().Length;
// index is 3

Neither .AsSpan() nor .TrimStart() creates a copy of the string; each just stores a reference to a string character and a length.

.AsSpan() is an extension method of String that creates a span pointing to the first character of the string. Its length is the total string length.
.TrimStart() is an extension method of ReadOnlySpan<char> that creates a span pointing to the first non-whitespace character. Its length is the total string length minus the position of the first non-whitespace character.

This pattern can be used in general to skip over any list of given characters:

string s = "foobar";
int index = s.Length - s.AsSpan().TrimStart("fo").Length;
// index is 3

I benchmarked this method and several others from this Q&A using BenchmarkDotNet (my benchmark code):

Method	Mean	Error	StdDev
Regex_Compiled	45.05 us	0.043 us	0.034 us
ReadOnlySpan_Trim (this answer)	50.24 us	0.073 us	0.061 us
String_Trim	94.64 us	0.458 us	0.428 us
Regex_Interpreted	114.41 us	0.224 us	0.210 us
Regex_StaticMethod (read below!)	114.19 us	0.056 us	0.046 us
FirstNonMatch	150.58 us	0.214 us	0.190 us
Array_FindIndex	200.40 us	1.951 us	1.730 us
StringExt_IndexOfPredicate	336.31 us	0.896 us	0.838 us
Linq_TakeWhile	490.97 us	0.994 us	0.930 us

I didn't expect Regex_Compiled to be the fastest. In principle, Regex_StaticMethod should perform on par with Regex_Compiled (because the static Regex methods cache compiled patterns), but as BenchmarkDotNet creates a new process per test run, that cache has no effect.

The String_Trim benchmark depends on how many characters follow the first non-whitespace character, because it copies the substring. For short texts, performance could be close to ReadOnlySpan_Trim, but for longer texts it will be much worse. The input text of this benchmark contains 50k non-whitespace characters, so there is already a significant difference.

Answer 8

You can trim the string, get its first character, and use IndexOf.
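
A minimal sketch of that idea, assuming -1 should be returned when the string is empty or all whitespace (the class and method names are hypothetical):

```csharp
using System;

public static class TrimIndexOf
{
    // Index of the first non-whitespace character, or -1 if there is none.
    public static int FirstNonWhitespace(string s)
    {
        // Drop the leading whitespace.
        string trimmed = s.TrimStart();

        // Nothing left: the input was empty or all whitespace.
        if (trimmed.Length == 0)
            return -1;

        // Everything before this character in s is whitespace, so its
        // first occurrence in s is the position we want.
        return s.IndexOf(trimmed[0]);
    }
}
```

Note that TrimStart allocates a new string when there is leading whitespace to remove, so this is compact rather than cheap.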

Answer 9

There is a very simple solution:

string test = "    hello world";
int pos = test.ToList<char>().FindIndex(x => char.IsWhiteSpace(x) == false);

pos will be 4

You can also use more complex conditions, like:

pos = test.ToList<char>().FindIndex((x) =>
                {
                    if (x == 's') //Your complex conditions go here
                        return true;
                    else 
                        return false;
                }
            );

Answer 10

Yes, you can try this:

string stg = "   xyz";
// TrimStart (not Trim) so trailing whitespace does not skew the result
int indx = (stg.Length - stg.TrimStart().Length);

Answer 11

Something is going to be looping somewhere. For full control over what is and isn't whitespace, you could use LINQ to Objects to do your loop:

int index = Array.FindIndex(
               s.ToCharArray(), 
               x => !(new [] { '\t', '\r', '\n', ' '}.Any(c => c == x)));

Answer 12

There are a lot of solutions here that convert the string to an array. That is not necessary: individual characters in a string can be accessed just like items in an array.

This is my solution that should be very efficient:

private static int FirstNonMatch(string s, Func<char, bool> predicate, int startPosition = 0)
{
    for (var i = startPosition; i < s.Length; i++)
        if (!predicate(s[i])) return i;

    return -1;
}

private static int LastNonMatch(string s, Func<char, bool> predicate, int startPosition)
{
    for (var i = startPosition; i >= 0; i--)
        if (!predicate(s[i])) return i;

    return -1;
}

And to use these, do the following:

var x = FirstNonMatch(" asdf ", char.IsWhiteSpace);
// Start at the last valid index; passing Length would throw IndexOutOfRangeException
var y = LastNonMatch(" asdf ", char.IsWhiteSpace, " asdf ".Length - 1);

12 answers

solution1
36 2012-10-02 17:57:16

solution2
13 ACCEPTED 2012-10-02 17:56:15

solution3
4 2012-10-02 18:08:43

solution4
3 2012-10-02 17:51:52

solution5
3 2012-10-02 17:53:13

solution6
3 2012-10-02 20:08:54

solution7
3 2022-05-26 14:37:23

solution8
1 2012-10-02 17:52:49

solution9
1 2016-02-19 10:12:41

solution10
0 2012-10-02 18:00:14

solution11
0 2012-10-02 18:05:40

solution12
0 2016-04-12 11:42:06
