String.Split returns weird result

I noticed that if I Split a string that contains only white-space on a white-space character, it returns an unexpected result. Consider this:

var spaces = string.Join("", Enumerable.Repeat(" ", 10));
int length = spaces.Length; // 10
var result = spaces.Split(' ');
length = result.Length;  // 11

I couldn't figure out why result contains 11 empty strings when I have 10 spaces in my input string. I also tried it with a letter, for example "a", and that doesn't make any difference:

var letters = string.Join("", Enumerable.Repeat("a", 10));
int length = letters.Length; // 10
var result = letters.Split('a');
length = result.Length;  // 11

In the documentation it says:

If two delimiters are adjacent, or a delimiter is found at the beginning or end of this instance, the corresponding array element contains Empty.

So I understand why I'm getting empty strings, but I don't understand where that extra element is coming from.

There is an example in the documentation:

var input = "42..12..19";
var result = input.Split('.');

That returns five results, two of which are empty strings, not three.

So is this the default and expected behaviour, or is it a bug or something?

Not a bug and totally expected behavior.

Look at it this way:

1-2-3

Split on the -. This leads to 3 elements: 1, 2 and 3.

Now take --3 and split on the dash again. This also gives 3 elements, with the first 2 being empty.

A delimiter is essentially an element that is between two other elements. The elements it is between can be empty. So if you have 10 spaces and are splitting on spaces then you will always have 11 elements.

Your last example with "42..12..19" being split on . is essentially 42.EMPTY.12.EMPTY.19, which is 5 elements.
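The counts above can be checked with a short sketch. It assumes C# 9 or later (top-level statements) and a console project; the variable names are only illustrative.

```csharp
using System;
using System.Linq;

// Number of pieces produced by each split discussed above.
int dashed = "1-2-3".Split('-').Length;
int leadingDashes = "--3".Split('-').Length;
int tenSpaces = new string(' ', 10).Split(' ').Length;

// Mark the empty pieces so they are visible when printed.
string[] dotted = "42..12..19".Split('.');
string marked = string.Join(".", dotted.Select(s => s.Length == 0 ? "EMPTY" : s));

Console.WriteLine(dashed);         // 3
Console.WriteLine(leadingDashes);  // 3
Console.WriteLine(tenSpaces);      // 11
Console.WriteLine(marked);         // 42.EMPTY.12.EMPTY.19
```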

It's matching an empty element after the last space. In your last example, place a . at the end of the string and you'll get 6 elements even though you only have 5 separators. In fact, just look at that example: there are 5 elements but only 4 separators. In general, you'll always have one more element than the number of separators, because there will be an element before each separator and one after the last one.
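A minimal sketch of the "separators plus one" rule, again assuming C# 9+ top-level statements. CountSeparators is a hypothetical helper written for this illustration, not a framework method.

```csharp
using System;
using System.Linq;

// Hypothetical helper: counts how often the separator occurs in the string.
int CountSeparators(string s, char sep) => s.Count(c => c == sep);

string input = "42..12..19";
string withTrailingDot = input + ".";

int separators = CountSeparators(input, '.');
int elements = input.Split('.').Length;
int separatorsTrailing = CountSeparators(withTrailingDot, '.');
int elementsTrailing = withTrailingDot.Split('.').Length;

Console.WriteLine($"{separators} separators -> {elements} elements");                 // 4 -> 5
Console.WriteLine($"{separatorsTrailing} separators -> {elementsTrailing} elements"); // 5 -> 6

// If the empty pieces are unwanted, StringSplitOptions.RemoveEmptyEntries drops them.
int nonEmpty = withTrailingDot
    .Split(new[] { '.' }, StringSplitOptions.RemoveEmptyEntries).Length;
Console.WriteLine(nonEmpty); // 3
```

The last call is a side note: it changes the result to only the non-empty pieces, so the "one more than the separators" relationship no longer holds for it.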

Consider this:

"1 2 3 4 5 6 7 8 9 10 11"

There are 10 spaces in the above, and 11 numbers. Each space separates the previous number from the next. The resulting array will have the same length if you remove the numbers. This is expected.

In your example, the beginning of the string is an element, up to the first delimiter. Since a delimiter is the first character, the first element of the array is empty. Afterwards, there is an empty array item added for each additional space.
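To see that removing the numbers leaves the array length unchanged, here is a small sketch under the same assumptions as before (C# 9+ top-level statements, illustrative variable names):

```csharp
using System;
using System.Linq;

// "1 2 3 4 5 6 7 8 9 10 11": 11 numbers separated by 10 spaces.
string numbers = string.Join(" ", Enumerable.Range(1, 11));

// The same string with the digits removed, leaving only the 10 spaces.
string onlySpaces = new string(numbers.Where(c => c == ' ').ToArray());

string[] withNumbers = numbers.Split(' ');
string[] withoutNumbers = onlySpaces.Split(' ');

Console.WriteLine(withNumbers.Length);                         // 11
Console.WriteLine(withoutNumbers.Length);                      // 11
Console.WriteLine(withoutNumbers.All(s => s.Length == 0));     // True
```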
