
How to split a string into words in Java while preserving punctuation?

This is the input:

hello; this is cool?
great,   awesome

I want my output to be:

hello;
this
is
cool?
great,
awesome

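I basically consider punctuation to be part of a word; that is the definition my application uses. I want to split the text into words on spaces, tabs, and newlines. Most Stack Overflow questions and answers assume that words do not contain punctuation, so how would I go about this?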

The comments in the code explain it directly:

// 1st possibility: every single whitespace character (space, tab, newline, carriage return, vertical tab, form feed) is treated as a separator
String s = "hello; this is cool?\ngreat,   awesome";
String[] array1 = s.split("\\s");
System.out.println("======first case=====");
for (String word : array1)
    System.out.println(word);

// 2nd possibility: each group of consecutive whitespace characters is treated as a single separator
String[] array2 = s.split("\\s+");
System.out.println("=====second case=====");
for (String word : array2)
    System.out.println(word);
// Notice the difference in the output!

Output:

======first case=====
hello;
this
is
cool?
great,
         <----- Notice the empty string
         <----- Notice the empty string
awesome
=====second case=====
hello;
this
is
cool?
great,
awesome
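
One caveat with the second case: if the input starts with whitespace, s.split("\\s+") returns an empty string as the first element. A minimal sketch of one way around that, matching runs of non-whitespace characters instead of splitting on whitespace. The class name WordSplitter and the method name splitWords are hypothetical, chosen only for this illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordSplitter {
    // \S+ matches one or more consecutive non-whitespace characters,
    // so punctuation stays attached to the word it touches.
    private static final Pattern WORD = Pattern.compile("\\S+");

    // Hypothetical helper: collects every run of non-whitespace characters.
    static List<String> splitWords(String text) {
        List<String> words = new ArrayList<>();
        Matcher matcher = WORD.matcher(text);
        while (matcher.find()) {
            words.add(matcher.group());
        }
        return words;
    }

    public static void main(String[] args) {
        // Same input as above, but with leading whitespace added.
        String s = "  hello; this is cool?\ngreat,   awesome";
        for (String word : splitWords(s)) {
            System.out.println(word);
        }
    }
}
```

Because only matches are collected, leading, trailing, and repeated whitespace never produce empty strings, and an empty or all-whitespace input yields an empty list.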

