Confusion about String's split method

I've gone through the documentation for String's split method, but the results are not what I expected. When we split a string with the limit argument set to a negative value, it seems to always append an empty string. Why does it do that? Consider some cases:

// Case 1
String str = "1#2#3#";
System.out.println(str.split("#").length); // Prints 3
System.out.println(str.split("#", -1).length); // Prints 4

I would expect both to print 3 here.

// Case 2
str = "";
System.out.println(str.split("#").length); // Prints 1
System.out.println(str.split("#", -1).length); // Prints 1

Now, since no match is found, I expected the usual split method without a limit to print 0, but it creates an array containing an empty string.

// Case 3
str = "#";
System.out.println(str.split("#").length); // Prints 0
System.out.println(str.split("#", -1).length); // Prints 2

Now I have a match, and the split method without the limit argument works fine. This is my expected output, but why doesn't it create an empty array in Case 2 as it does here?

// Case 4
str = "###";
System.out.println(str.split("#").length); // Prints 0
System.out.println(str.split("#", -1).length); // Prints 4

Here the first split call behaves as expected, but why does the second one give 4 instead of 3?

// Case 5
str = "1#2#3#";
System.out.println(str.split("#", 0).length); // Prints 3
System.out.println(str.split("#", 3).length); // Prints 3
System.out.println(str.split("#", 4).length); // Prints 4

Now the last case, with a positive limit. If the limit is <= the number of matches, the result is as expected. But if we give a higher positive limit, it again appends an empty string to the resulting array.

From the JavaDoc for String

The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter. If n is non-positive then the pattern will be applied as many times as possible and the array can have any length. If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.

Emphasis mine.

With a negative limit, trailing empty strings are not discarded, so if I represent an empty string with E:

1#2#3# -> 1 # 2 # 3 # E
E      -> E
#      -> E # E
###    -> E # E # E # E
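The table above can be checked directly. Here is a minimal sketch (the class name SplitNegativeLimit and the helper parts are made up for illustration) that splits the same four inputs with a negative limit and prints how many pieces come back:

```java
import java.util.Arrays;

public class SplitNegativeLimit {
    // Hypothetical helper: split on "#" with a negative limit,
    // so trailing empty strings are kept in the result.
    static String[] parts(String input) {
        return input.split("#", -1);
    }

    public static void main(String[] args) {
        for (String input : new String[] {"1#2#3#", "", "#", "###"}) {
            String[] p = parts(input);
            // Lengths match the question's cases 1 to 4: 4, 1, 2, 4
            System.out.println("\"" + input + "\" -> " + p.length + " " + Arrays.toString(p));
        }
    }
}
```

Arrays.toString renders each empty string as nothing between the commas, so counting p.length is the more reliable check.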

In your last example (with a positive limit), trailing empty strings are still kept; they are only discarded if n == 0.
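To see what each limit does to "1#2#3#", a small sketch like the following may help. The class name SplitLimitComparison and the helper show are hypothetical; the helper only quotes each element so that empty strings are visible:

```java
public class SplitLimitComparison {
    // Hypothetical helper: split on "#" with the given limit and
    // render every element in double quotes so "" is visible.
    static String show(String input, int limit) {
        String[] parts = input.split("#", limit);
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append('"').append(parts[i]).append('"');
        }
        return sb.append("]").toString();
    }

    public static void main(String[] args) {
        String str = "1#2#3#";
        System.out.println(show(str, 0));  // ["1", "2", "3"]      trailing "" discarded
        System.out.println(show(str, 3));  // ["1", "2", "3#"]     pattern applied only twice
        System.out.println(show(str, 4));  // ["1", "2", "3", ""]  trailing "" kept
        System.out.println(show(str, -1)); // ["1", "2", "3", ""]  trailing "" kept
    }
}
```

With a limit of 3 the array has three entries only because the pattern stops after two matches and the last entry holds the rest of the input ("3#"), not because an empty string was dropped.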

The main source of confusion comes from this often-missed part of the doc:

... If n is zero then ..., and trailing empty strings will be discarded.

Once you get that everything makes sense.
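That also explains the difference between Case 2 and Case 3. The same JavaDoc states that if the expression does not match any part of the input, the resulting array has just one element, namely the string itself. So "" is returned as a single (empty) element because nothing matched, while "#" does match and produces two empty strings, both of which are trailing and get discarded when n is zero. A minimal sketch of that (the class name SplitNoMatchVsAllEmpty and the helper count are made up for illustration):

```java
public class SplitNoMatchVsAllEmpty {
    // Hypothetical helper: number of pieces when splitting on "#" with the given limit.
    static int count(String input, int limit) {
        return input.split("#", limit).length;
    }

    public static void main(String[] args) {
        // No "#" in the input: the array holds the input itself, so length 1.
        System.out.println(count("", 0));   // 1
        System.out.println(count("", -1));  // 1

        // "#" matches once, giving two empty strings.
        System.out.println(count("#", 0));  // 0, both are trailing and discarded
        System.out.println(count("#", -1)); // 2, both are kept
    }
}
```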

From the documentation

The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter. If n is non-positive then the pattern will be applied as many times as possible and the array can have any length. If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.

It appears that a negative limit is defined to apply the pattern as many times as possible and to keep whatever is left at the end, including trailing empty strings.
