Strange behavior of Java String split() method

I have a method that takes a string parameter, splits it on #, and then prints the length of the resulting array along with its elements. Below is my code:

import java.util.Arrays; // at the top of the file

public void StringSplitTesting(String inputString) {
    // One-argument split(): equivalent to split("#", 0)
    String[] tokenArray = inputString.split("#");

    System.out.println("tokenArray length is " + tokenArray.length
            + " and array elements are " + Arrays.toString(tokenArray));
}

Case I: When my input is abc#, the output is: tokenArray length is 1 and array elements are [abc]

Case II: But when my input is #abc, the output is: tokenArray length is 2 and array elements are [, abc]

I was expecting the same output in both cases. What is the reason behind this implementation? Why does the split() method behave like this? Could someone give me a proper explanation?

One aspect of the one-argument split method can be surprising: trailing empty strings (not nulls) are discarded from the returned array. From the Javadoc:

Trailing empty strings are therefore not included in the resulting array.
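To make the rule concrete, here is a small sketch that runs the one-argument split on a few inputs. The class and method names (TrailingEmptyDemo, show) are made up for illustration; only String.split and Arrays.toString come from the JDK.

```java
import java.util.Arrays;

public class TrailingEmptyDemo {
    // Hypothetical helper: splits on "#" with the one-argument split()
    // and renders the resulting array as text.
    static String show(String input) {
        return Arrays.toString(input.split("#"));
    }

    public static void main(String[] args) {
        System.out.println(show("abc#"));   // [abc]     trailing empty string dropped
        System.out.println(show("abc###")); // [abc]     all trailing empty strings dropped
        System.out.println(show("#abc"));   // [, abc]   leading empty string kept
        System.out.println(show("a##b"));   // [a, , b]  interior empty string kept
    }
}
```

Only empty strings at the end of the array are removed; empty strings at the start or in the middle survive, which is why the two cases in the question differ.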

To get a length of 2 in both cases, pass a negative second argument to the two-argument split method. A negative limit means the length of the array is unrestricted and no trailing empty strings are discarded.
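A minimal sketch of that suggestion, using the two inputs from the question. The class and method names (NegativeLimitDemo, splitKeepingAll) are illustrative and not part of any API.

```java
import java.util.Arrays;

public class NegativeLimitDemo {
    // Hypothetical helper: a negative limit keeps trailing empty strings.
    static String[] splitKeepingAll(String input) {
        return input.split("#", -1);
    }

    public static void main(String[] args) {
        for (String input : new String[] {"abc#", "#abc"}) {
            String[] tokens = splitKeepingAll(input);
            // Both inputs now yield two tokens: [abc, ] and [, abc]
            System.out.println(input + " -> length " + tokens.length
                    + ", elements " + Arrays.toString(tokens));
        }
    }
}
```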

Just take a look at the documentation:

Trailing empty strings are therefore not included in the resulting array.

So in case 1 the result would be {"abc", ""}, but Java drops the trailing empty string. If you don't want trailing empty strings to be discarded, use split("#", -1).

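The observed behavior follows from how split() is implemented. The token-collecting loop treats a delimiter at the start and at the end of the string the same way, and the half-open semantics of substring() explain where the empty tokens come from. The asymmetry itself appears afterwards, when trailing empty strings are dropped for a limit of 0.

This is the core of the implementation of split():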

        while ((next = indexOf(ch, off)) != -1) {
            if (!limited || list.size() < limit - 1) {
                list.add(substring(off, next));
                off = next + 1;
            } else {    // last one
                //assert (list.size() == limit - 1);
                list.add(substring(off, value.length));
                off = value.length;
                break;
            }
        }
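The excerpt stops before the steps that matter most for the question: after the loop, the remaining segment is added, and for a limit of 0 the trailing empty strings are removed. The following is a simplified sketch of that flow for a single-character delimiter, not the JDK source. The class name, method name, and the dropTrailingEmpty flag are assumptions introduced only to make the two behaviors comparable.

```java
import java.util.ArrayList;
import java.util.List;

public class SimpleSplitSketch {
    // Hypothetical helper: imitates split() for one delimiter character.
    // dropTrailingEmpty = true corresponds to limit 0, false to a negative limit.
    static List<String> splitOnChar(String s, char ch, boolean dropTrailingEmpty) {
        List<String> list = new ArrayList<>();
        int off = 0;
        int next;
        while ((next = s.indexOf(ch, off)) != -1) {
            list.add(s.substring(off, next)); // empty when next == off
            off = next + 1;
        }
        if (off == 0) {          // delimiter never found: return the input as is
            list.add(s);
            return list;
        }
        list.add(s.substring(off)); // remaining segment, empty if s ends with ch
        if (dropTrailingEmpty) {
            int size = list.size();
            while (size > 0 && list.get(size - 1).isEmpty()) {
                size--;          // strip empty strings from the end only
            }
            return new ArrayList<>(list.subList(0, size));
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(splitOnChar("abc#", '#', true));  // [abc]
        System.out.println(splitOnChar("abc#", '#', false)); // [abc, ]
        System.out.println(splitOnChar("#abc", '#', true));  // [, abc]
    }
}
```

Both inputs produce an empty token inside the loop or right after it; only the final stripping step, which looks at the end of the list, makes "abc#" and "#abc" come out differently.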

The key to understanding the behavior of the above code is to understand the behavior of the substring() method:

From the Javadocs:

String java.lang.String.substring(int beginIndex, int endIndex)

Returns a new string that is a substring of this string. The substring begins at the specified beginIndex and extends to the character at index endIndex - 1. Thus the length of the substring is endIndex-beginIndex.

Examples:

"hamburger".substring(4, 8) returns "urge" (not "urger")

"smiles".substring(1, 5) returns "mile" (not "miles")
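Applied to the inputs from the question, the same half-open rule yields an empty string whenever the begin and end indices coincide. The sketch below uses a made-up helper, tokenBetween, that mirrors the substring(off, next) call from the loop; the indices are worked out by hand for a "#" delimiter.

```java
public class SubstringBoundsDemo {
    // Hypothetical helper: the text from offset off up to (not including) index next.
    static String tokenBetween(String s, int off, int next) {
        return s.substring(off, next);
    }

    public static void main(String[] args) {
        // Javadoc example: end index 8 is exclusive.
        System.out.println(tokenBetween("hamburger", 4, 8));             // urge

        // "#abc": the delimiter sits at index 0, so the first token is empty.
        System.out.println("[" + tokenBetween("#abc", 0, 0) + "]");      // []

        // "abc#": the delimiter sits at index 3, so the first token is "abc" ...
        System.out.println("[" + tokenBetween("abc#", 0, 3) + "]");      // [abc]

        // ... and the segment after the delimiter (indices 4 to 4) is empty.
        System.out.println("[" + tokenBetween("abc#", 4, 4) + "]");      // []
    }
}
```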

Hope this helps.
