
String.Split returns weird result

I noticed that if I split a string that contains only spaces on the space character, it returns an unexpected result. Consider this:

var spaces = string.Join("", Enumerable.Repeat(" ", 10));
int length = spaces.Length; // 10
var result = spaces.Split(' ');
length = result.Length;     // 11

I can't figure out why result contains 11 empty strings when the input string has only 10 spaces. I also tried it with a letter, for example "a", and it makes no difference:

var letters = string.Join("", Enumerable.Repeat("a", 10));
int length = letters.Length; // 10
var result = letters.Split('a');
length = result.Length;      // 11

In the documentation it says:

If two delimiters are adjacent, or a delimiter is found at the beginning or end of this instance, the corresponding array element contains Empty.

So I understand why I'm getting empty strings, but I don't understand where the extra element is coming from.

There is an example in the documentation:

var input = "42..12..19";
var result = input.Split('.');

That returns five results, and two of them are empty strings, not three.

So is this the default and expected behaviour, or is it a bug or something?

Not a bug and totally expected behavior.

Look at it this way:

1-2-3

Split it on the -. This yields 3 elements: 1, 2 and 3.

Now take --3 and split on the dash again. That also yields 3 elements, with the first 2 being empty.

A delimiter is essentially an element that is between two other elements. The elements it is between can be empty. So if you have 10 spaces and are splitting on spaces then you will always have 11 elements.

Your last example, "42..12..19" split on ., is essentially 42.EMPTY.12.EMPTY.19, which is 5 elements.
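To make the empty elements visible, here is a small sketch that prints each piece returned by Split and marks empty strings as EMPTY. The Describe helper is made up for illustration and is not part of the framework.

```csharp
using System;
using System.Linq;

// Hypothetical helper: lists the elements Split returns,
// replacing each empty string with the word EMPTY.
string Describe(string input, char separator)
{
    string[] parts = input.Split(separator);
    var shown = parts.Select(p => p.Length == 0 ? "EMPTY" : p);
    return $"{parts.Length}: {string.Join(",", shown)}";
}

Console.WriteLine(Describe("1-2-3", '-'));      // 3: 1,2,3
Console.WriteLine(Describe("--3", '-'));        // 3: EMPTY,EMPTY,3
Console.WriteLine(Describe("42..12..19", '.')); // 5: 42,EMPTY,12,EMPTY,19
```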

It's matching an empty element after the last space. In your last example, place a . at the end of the string and you'll get 6 elements even though you only have 5 separators. In fact, just look at that example - there are 5 elements but only 4 separators. In general, you'll always have one more element than the number of separators because there will be an element before each separator and one after the last one.
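The "one more element than separators" rule can be checked directly. The following is a rough sketch, assuming the default Split overload with no StringSplitOptions; CountSeparators is a hypothetical helper written for this example.

```csharp
using System;
using System.Linq;

// Hypothetical helper: counts how many times the separator occurs in the input.
int CountSeparators(string input, char separator) => input.Count(c => c == separator);

var samples = new[]
{
    ("42..12..19", '.'),        // 4 separators
    ("42..12..19.", '.'),       // 5 separators, one of them at the end
    (new string(' ', 10), ' ')  // 10 separators and nothing else
};

foreach (var (input, separator) in samples)
{
    int separators = CountSeparators(input, separator);
    int elements = input.Split(separator).Length;
    // elements is separators + 1 for every sample
    Console.WriteLine($"{separators} separators -> {elements} elements");
}
```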

Consider this:

"1 2 3 4 5 6 7 8 9 10 11"

There are 10 spaces in the above, and 11 numbers. Each space separates the previous number from the next. The resulting array will have the same length if you remove the numbers. This is expected.
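As a quick check of this claim, the sketch below builds the string of numbers, strips out everything except the spaces, and compares the two Split results. The variable names are only illustrative.

```csharp
using System;
using System.Linq;

// "1 2 3 4 5 6 7 8 9 10 11": 11 numbers separated by 10 spaces.
string numbers = string.Join(" ", Enumerable.Range(1, 11));

// The same string with the numbers removed: only the 10 spaces remain.
string spacesOnly = new string(numbers.Where(c => c == ' ').ToArray());

int withNumbers = numbers.Split(' ').Length;
int withoutNumbers = spacesOnly.Split(' ').Length;

Console.WriteLine(spacesOnly.Length); // 10
Console.WriteLine(withNumbers);       // 11
Console.WriteLine(withoutNumbers);    // 11
```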

In your example, the beginning of the string, up to the first delimiter, is an element. Since a delimiter is the first character, that first element is empty. After that, each of the 10 spaces is followed by one more element, which is also empty, giving 11 in total.
