
String.split's weird behaviour dealing with spaces and tabs

I have a string consisting of tabs, spaces, and some arbitrary characters. The string below is made up of: space, space, tab, tab, 1, space, tab, -2, tab, space, +, space.

import java.util.Arrays;

// Two spaces, two tabs, "1", space, tab, "-2", tab, space, "+", space
String[] s = "  \t\t1 \t-2\t + ".split("[\\s]+");
System.out.println(Arrays.toString(s));

Splitting with the regex [\\s]+, one would expect to get [1, -2, +]. However, the array I get on my machine (OS X, JDK 1.6.0_37) is [, 1, -2, +].

It turns out the first element is simply "blank" (s[0].equals("") returns true), and so it should have been matched by \\s.

What am I missing?

If the string starts with a delimiter match when you split it, the first element of the resulting array is always an empty string.

Think of it this way: every string begins with an empty string. So the delimiter \\s+ divides the string " a" (note the leading whitespace) into two parts: the part before the \\s+ match, which is the empty string "", and the part after it, which is a.

So the output you got is exactly what split is expected to produce.
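The " a" case described above can be checked directly. This is only a minimal sketch: the class name LeadingEmptyDemo and the helper splitOnWhitespace are made up for illustration and do not come from the question.

```java
import java.util.Arrays;

class LeadingEmptyDemo {
    // Hypothetical helper: splits on runs of whitespace, exactly like the question's call.
    static String[] splitOnWhitespace(String input) {
        return input.split("\\s+");
    }

    public static void main(String[] args) {
        // The delimiter matches at index 0, so the piece before it is "".
        String[] parts = splitOnWhitespace(" a");
        System.out.println(Arrays.toString(parts)); // [, a]
        System.out.println(parts[0].isEmpty());     // true

        // No delimiter at the start, so there is no leading empty element.
        System.out.println(Arrays.toString(splitOnWhitespace("a b"))); // [a, b]
    }
}
```

Note that trailing empty strings are dropped by the single-argument split, which is why the trailing space in the question's string does not produce a final empty element.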

You wrote: "It turns out the first element is simply "blank" (s[0].equals("") returns true) and so it should have been matched by \\s."

No, it shouldn't have been. A space is not an empty string; the two are different. The pattern \\s matches a whitespace character, whereas the empty string contains no characters at all.
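To make the distinction concrete, and to show one common way of getting [1, -2, +], here is a hedged sketch. Trimming before splitting is not part of the answer above; it is an assumption about what the asker wants, and the names TrimThenSplit and tokens are hypothetical.

```java
import java.util.Arrays;

class TrimThenSplit {
    // Hypothetical helper: drop leading/trailing whitespace, then split on whitespace runs.
    static String[] tokens(String input) {
        String trimmed = input.trim();
        // "".split(...) would return a one-element array containing "", so handle it here.
        if (trimmed.isEmpty()) {
            return new String[0];
        }
        return trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        // The question's string: two spaces, two tabs, 1, space, tab, -2, tab, space, +, space.
        String input = "  \t\t1 \t-2\t + ";

        System.out.println(Arrays.toString(input.split("\\s+"))); // [, 1, -2, +]
        System.out.println(Arrays.toString(tokens(input)));       // [1, -2, +]

        // A space is a one-character string; the empty string has no characters.
        System.out.println(" ".isEmpty()); // false
        System.out.println("".isEmpty());  // true
    }
}
```

String.trim() removes spaces and tabs at both ends, so after trimming there is no delimiter match at index 0 and therefore no leading empty element.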
