
Java Regex Split \\S --> Strange result for String split method \\S

I am puzzled by the split method with a regex in Java. It is a rather theoretical question that popped up, and I can't figure it out.

I found this answer: Java split by \\S, but the advice to use \\s instead of \\S does not explain what is happening here.

Why does quote.split("\\S") return 2 results in case A and 8 in case B?

case A)

String quote = " x xxxxxx";
String[] words = quote.split("\\S"); 
System.out.print("\\S >>\t");
for (String word : words) {
  System.out.print(":" + word);
}
System.out.println(words.length);

Result:

\S >> : : 2

case B)

String quote = " x xxxxxx ";
String[] words = quote.split("\\S"); 
System.out.print("\\S >>\t");
for (String word : words) {
  System.out.print(":" + word);
}
System.out.println(words.length);

Result:

\S >> : : :::::: 8

It would be wonderful to understand what happens here. Thanks in advance.

As Jongware noticed, the documentation for String.split(String) says:

This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.

So it works somewhat like this:

"a:b:::::".split(":")  === removeTrailing([a,b,,,,,])  === [a,b]
"a:b:::::c".split(":") === removeTrailing([a,b,,,,,c]) === [a,b,,,,,c]
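The removeTrailing step above is only notation, not a real Java method. A minimal sketch of it, assuming a hypothetical helper of that name and using the two-argument split with a negative limit to obtain every piece first, could look like this. It ignores edge cases such as an input in which the pattern never matches.

```java
import java.util.Arrays;

public class TrailingEmptyDemo {
    // Hypothetical helper mirroring the "removeTrailing" notation above:
    // drops empty strings from the end of the array only.
    static String[] removeTrailing(String[] parts) {
        int end = parts.length;
        while (end > 0 && parts[end - 1].isEmpty()) {
            end--;
        }
        return Arrays.copyOf(parts, end);
    }

    public static void main(String[] args) {
        String input = "a:b:::::";
        String[] all = input.split(":", -1);      // negative limit keeps every piece
        String[] viaHelper = removeTrailing(all); // emulates the limit-zero behaviour
        String[] viaSplit = input.split(":");     // one-argument split, limit zero

        System.out.println(all.length);                          // 7
        System.out.println(Arrays.toString(viaHelper));          // [a, b]
        System.out.println(Arrays.equals(viaHelper, viaSplit));  // true
    }
}
```

For "a:b:::::c" the last piece is "c", so nothing is trailing and all 7 pieces survive.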

And in your example:

" x xxxxxx".split("\\S")  === removeTrailing([ , ,,,,,,])  === [ , ]
" x xxxxxx ".split("\\S") === removeTrailing([ , ,,,,,, ]) === [ , ,,,,,, ]
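One way to see why only case A loses pieces is to check where the last \\S match ends. The sketch below uses a hypothetical helper, endsWithDelimiter, that is not part of the original answer: when the final match ends exactly at the end of the input, the last piece is empty and the run of empty strings before it becomes trailing.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DelimiterPositionsDemo {
    // Returns true when the last \S match ends exactly at the end of the input,
    // which is the situation that leaves an empty piece at the tail.
    static boolean endsWithDelimiter(String s) {
        Matcher m = Pattern.compile("\\S").matcher(s);
        int lastEnd = -1;
        while (m.find()) {
            lastEnd = m.end();
        }
        return lastEnd == s.length();
    }

    public static void main(String[] args) {
        System.out.println(endsWithDelimiter(" x xxxxxx"));    // true
        System.out.println(endsWithDelimiter(" x xxxxxx "));   // false
        System.out.println(" x xxxxxx".split("\\S").length);   // 2
        System.out.println(" x xxxxxx ".split("\\S").length);  // 8
    }
}
```

In case B the trailing space forms a non-empty last piece, so the empty strings in front of it are no longer trailing and are kept.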

To collapse consecutive delimiters into one, use the \\S+ pattern.

" x xxxxxx".split("\\S+")  === removeTrailing([ , ,])  === [ , ]
" x xxxxxx ".split("\\S+") === removeTrailing([ , , ]) === [ , , ]
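Whitespace-only and empty pieces are hard to tell apart in the notation above, so here is a small sketch that prints each piece in quotes. The show helper is an illustrative assumption, not something from the original answer.

```java
import java.util.StringJoiner;

public class CollapseDelimitersDemo {
    // Renders each piece in quotes so whitespace-only and empty pieces are visible.
    static String show(String[] parts) {
        StringJoiner joiner = new StringJoiner(", ", "[", "]");
        for (String part : parts) {
            joiner.add("\"" + part + "\"");
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(show(" x xxxxxx".split("\\S+")));   // [" ", " "]
        System.out.println(show(" x xxxxxx ".split("\\S+")));  // [" ", " ", " "]
        System.out.println(show(" x xxxxxx ".split("\\S")));   // [" ", " ", "", "", "", "", "", " "]
    }
}
```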

As suggested in the comments, to keep the trailing empty strings we can use the overloaded version of the split method, String.split(String, int), with a negative number passed as the limit.

"a:b:::::".split(":", -1)  === [a,b,,,,,]
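Applied to the strings from the question, a negative limit makes both cases report the same number of pieces. The following is a small sketch; countAllPieces is a hypothetical helper name used only for illustration.

```java
public class NegativeLimitDemo {
    // Counts every piece, including trailing empty strings, by passing a negative limit.
    static int countAllPieces(String input, String regex) {
        return input.split(regex, -1).length;
    }

    public static void main(String[] args) {
        System.out.println(countAllPieces("a:b:::::", ":"));       // 7
        System.out.println(countAllPieces(" x xxxxxx", "\\S"));    // 8
        System.out.println(countAllPieces(" x xxxxxx ", "\\S"));   // 8
        System.out.println(" x xxxxxx".split("\\S").length);       // 2 (limit zero)
    }
}
```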
