Split string by whitespaces removes new line characters
I'm splitting a string by whitespace, but for some reason the newline characters are being removed. For example:
String[] splitSentence = "Example sentence\n\n This sentence is an example".
split("\\s+");
splitSentence will contain this:
["Example", "sentence", "This", "sentence", "is", "an", "example"]
And if I do this:
String[] splitSentence = "Example sentence\n\n This sentence is an example".
split("\\s");
splitSentence will contain this:
["Example", "sentence", "", "", "This", "sentence", "is", "an", "example"]
I'm trying to achieve something like this:
["Example", "sentence\n\n", "This", "sentence", "is", "an", "example"]
Or like this:
["Example", "sentence", "\n", "\n", "This", "sentence", "is", "an", "example"]
I've tried a lot of things with no luck... Any help will be appreciated.
String[] splitSentence = "Example sentence\n\n This sentence is an example".
split(" ");
This version should work: it splits on the space character only, so spaces are removed and the newlines are left in place.
Split by spaces and tabs (without newline):
String[] splitSentence = "Example sentence\n\n This sentence is an example".split("[ \t]+");
Result: ["Example", "sentence\n\n", "This", "sentence", "is", "an", "example"]
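As a quick check of the split above, here is a minimal sketch. The class and method names (SpaceTabSplitDemo, splitOnSpacesAndTabs) are made up for illustration; only String.split and Arrays.toString are standard library calls.

```java
import java.util.Arrays;

class SpaceTabSplitDemo {
    // Splits on runs of spaces and tabs only. '\n' is not in the character
    // class, so newlines stay attached to the neighbouring token.
    static String[] splitOnSpacesAndTabs(String text) {
        return text.split("[ \t]+");
    }

    public static void main(String[] args) {
        String[] parts = splitOnSpacesAndTabs("Example sentence\n\n This sentence is an example");
        // Escape the newlines so they are visible in the printed output.
        System.out.println(Arrays.toString(parts).replace("\n", "\\n"));
        // prints: [Example, sentence\n\n, This, sentence, is, an, example]
    }
}
```

Note that if the input starts with a space or tab, the first element of the result is an empty string, so you may want to trim leading blanks first.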
In a regex, \s is defined to be equivalent to the characters in this set:
[ \t\n\x0B\f\r]
(See the javadoc for java.util.regex.Pattern.) If you don't want newlines to be treated like spaces, then you can write your own set:
splitSentence = "Example sentence\n\n This sentence is an example".split("[ \t\\x0B\f\r]+");
(or eliminate other characters you don't want the split to recognize).
(\t is TAB, \x0B is vertical tab, \f is FF (form feed), \r is CR)
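To see what this custom set keeps and what it drops, here is a small sketch on a different input that mixes a tab and a Windows-style line ending. The class name, method name and sample string are assumptions made for the demonstration; the pattern is the one from the answer, precompiled with Pattern.compile.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class NoNewlineWhitespaceDemo {
    // \s minus '\n': space, tab, vertical tab, form feed, carriage return.
    private static final Pattern DELIMITER = Pattern.compile("[ \t\\x0B\f\r]+");

    static String[] splitKeepingLineFeeds(String text) {
        return DELIMITER.split(text);
    }

    public static void main(String[] args) {
        // Hypothetical input: a tab, a "\r\n" line ending, and a "\n" before a space.
        String[] parts = splitKeepingLineFeeds("a\tb\r\nc\n d");
        // Escape the newlines so they are visible in the printed output.
        System.out.println(Arrays.toString(parts).replace("\n", "\\n"));
        // prints: [a, b, \nc\n, d]
    }
}
```

Because \r is still in the set, the carriage return of a "\r\n" line ending is consumed as a delimiter while the \n survives. If you need to keep "\r\n" intact, remove \r from the set as well.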
EDIT: This method seems to produce the second result you mentioned, where the \n's are returned as separate strings:
splitSentence = "Example sentence\n\n This sentence is an example".split("[ \t\\x0B\f\r]+|(?=\n)");
This uses lookahead to split at a point that is immediately followed by \n, but doesn't treat \n as a delimiter that will be removed from the result.
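The lookahead variant can be tried out with the sketch below. The class and method names are hypothetical; the regex is the one from the answer, and Pattern.split is used in place of String.split, which should behave the same way here.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class NewlineTokenSplitDemo {
    // Either consume a run of non-newline whitespace, or split (without
    // consuming anything) at any position immediately followed by '\n'.
    private static final Pattern SPLITTER = Pattern.compile("[ \t\\x0B\f\r]+|(?=\n)");

    static String[] splitWithNewlineTokens(String text) {
        return SPLITTER.split(text);
    }

    public static void main(String[] args) {
        String[] parts = splitWithNewlineTokens("Example sentence\n\n This sentence is an example");
        // Escape the newlines so they are visible in the printed output.
        System.out.println(Arrays.toString(parts).replace("\n", "\\n"));
        // prints: [Example, sentence, \n, \n, This, sentence, is, an, example]
    }
}
```

One thing to watch for: the lookahead only splits before a \n, not after it. In the example the last \n ends up alone because a space follows it, but for an input such as "a\nb" the result is ["a", "\nb"], with the newline still attached to the following word.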