
Java Regex Split \\S --> Strange result for String split method \\S

I am puzzled by the split method with a regex in Java. It is a rather theoretical question that popped up, and I can't figure it out.

I found this answer: "Java split by \\S", but the advice to use \\s instead of \\S does not explain what is happening here.

Why does quote.split("\\S") return 2 results in case A and 8 in case B?

Case A)

String quote = " x xxxxxx";
String[] words = quote.split("\\S"); 
System.out.print("\\S >>\t");
for (String word : words) {
  System.out.print(":" + word);
}
System.out.println(words.length);

Result:

\S >> : : 2

Case B)

String quote = " x xxxxxx ";
String[] words = quote.split("\\S"); 
System.out.print("\\S >>\t");
for (String word : words) {
  System.out.print(":" + word);
}
System.out.println(words.length);

Result:

\S >> : : :::::: 8

It would be wonderful to understand what happens here. Thanks in advance.

As Jongware noticed, the documentation for String.split(String) says:

This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.

So it works somewhat like this:

"a:b:::::".split(":")  === removeTrailing([a,b,,,,,])  === [a,b]
"a:b:::::c".split(":") === removeTrailing([a,b,,,,,c]) === [a,b,,,,,c]

And in your example, where each non-whitespace character is its own delimiter:

" x xxxxxx".split("\\S")  === removeTrailing([ , ,,,,,,])  === [ , ]
" x xxxxxx ".split("\\S") === removeTrailing([ , ,,,,,, ]) === [ , ,,,,,, ]
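The two cases can be checked directly. The following is only a minimal sketch: the class name SplitCases and the helper splitOnNonWhitespace are made up for illustration, and the only library calls are String.split and Arrays.toString. In case A the six empty strings after the last x are trailing and get dropped; in case B the final element is a space, so nothing counts as trailing and all 8 elements survive.

```java
import java.util.Arrays;

// Hypothetical demo class; the name is not from the original post.
class SplitCases {
    // Splits on every single non-whitespace character, as in the question.
    static String[] splitOnNonWhitespace(String input) {
        return input.split("\\S");
    }

    public static void main(String[] args) {
        String[] caseA = splitOnNonWhitespace(" x xxxxxx");   // no trailing space
        String[] caseB = splitOnNonWhitespace(" x xxxxxx ");  // trailing space

        // Length 2: the trailing empty strings were removed.
        System.out.println(caseA.length + " " + Arrays.toString(caseA));
        // Length 8: the last element is " ", so the empty strings are not trailing.
        System.out.println(caseB.length + " " + Arrays.toString(caseB));
    }
}
```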

To collapse multiple delimiters into one, use the \\S+ pattern.

" x xxxxxx".split("\\S+")  === removeTrailing([ , ,])  === [ , ]
" x xxxxxx ".split("\\S+") === removeTrailing([ , , ]) === [ , , ]
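As a quick check of the \\S+ variant, here is a small sketch; the class SplitCollapsed and the helper splitOnRuns are illustrative names, not part of the original answer. A run of x characters now counts as one delimiter, so at most one empty string can appear at the end, and it is still removed as a trailing empty string.

```java
import java.util.Arrays;

// Hypothetical demo class; the name is not from the original post.
class SplitCollapsed {
    // Treats each run of one or more non-whitespace characters as a single delimiter.
    static String[] splitOnRuns(String input) {
        return input.split("\\S+");
    }

    public static void main(String[] args) {
        String[] caseA = splitOnRuns(" x xxxxxx");   // one trailing empty string, dropped
        String[] caseB = splitOnRuns(" x xxxxxx ");  // ends with " ", nothing dropped

        System.out.println(caseA.length + " " + Arrays.toString(caseA)); // length 2
        System.out.println(caseB.length + " " + Arrays.toString(caseB)); // length 3
    }
}
```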

As suggested in the comments, to keep the trailing empty strings we can use the overloaded version of the split method, String.split(String, int), with a negative number passed as the limit.

"a:b:::::".split(":", -1)  === [a,b,,,,,]
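Applied to the strings from this thread, a negative limit could be tried as in the sketch below. The class SplitKeepTrailing and the helper splitKeepingTrailing are hypothetical names chosen for this example; the only API used is String.split(String, int).

```java
import java.util.Arrays;

// Hypothetical demo class; the name is not from the original post.
class SplitKeepTrailing {
    // A negative limit applies the pattern as often as possible and keeps trailing empty strings.
    static String[] splitKeepingTrailing(String input, String regex) {
        return input.split(regex, -1);
    }

    public static void main(String[] args) {
        String[] colons = splitKeepingTrailing("a:b:::::", ":");
        String[] caseA = splitKeepingTrailing(" x xxxxxx", "\\S");

        System.out.println(colons.length + " " + Arrays.toString(colons)); // length 7
        System.out.println(caseA.length + " " + Arrays.toString(caseA));   // length 8
    }
}
```

With the limit of -1, case A yields 8 elements, the same count as case B, which shows that the difference in the question comes only from the removal of trailing empty strings.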
