
Trouble replacing strings in Java

If I have this string:

String line = "This, is Stack; Overflow.";

and want to split it into the following array of strings:

String[] array = ...

so that the array contains this output:

["This",",","is","Stack",";","Overflow","."]

what regular expression should I put into the split() method?

Split the input either on whitespace or at the boundary between a word character and the non-word character that follows it.

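// Requires: import java.util.Arrays;
String s = "This, is Stack; Overflow.";
// Split on whitespace, or at a zero-width word -> non-word boundary.
String[] parts = s.split("\\s|(?<=\\w)(?=\\W)");
System.out.println(Arrays.toString(parts));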

Here \\s, \\w and \\W match a whitespace character, a word character and a non-word character respectively.

  • \\s matches a single whitespace character, which is consumed as the delimiter.
  • (?<=\\w) is a positive look-behind, which asserts that the match must be preceded by a word character (a-z, A-Z, 0-9, _).
  • (?=\\W) is a positive look-ahead, which asserts that the match must be followed by a non-word character (any character other than a word character). So (?<=\\w)(?=\\W) matches only the boundary between the two; it does not consume a character.
  • Thus, splitting the input on the matched spaces and boundaries gives the desired output.


OR, to also split where a punctuation character is immediately followed by a word character:

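String s = "This, is Stack; Overflow.";
// Third alternative: boundary after a non-word, non-space character that precedes a word character.
String[] parts = s.split("\\s|(?<=\\w)(?=\\W)|(?<=[^\\w\\s])(?=\\w)");
System.out.println(Arrays.toString(parts));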

Output:

[This, ,, is, Stack, ;, Overflow, .]
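For the sentence in the question both patterns give the same result. The difference only shows when punctuation is directly followed by a word with no space in between. A minimal sketch of that case, where the class name, constant names and the input "Hello,world" are made up for illustration:

```java
import java.util.Arrays;

class SplitComparison {
    // First pattern from the answer: whitespace, or a word -> non-word boundary.
    static final String WORD_TO_PUNCT = "\\s|(?<=\\w)(?=\\W)";
    // Second pattern: additionally splits at a punctuation -> word boundary.
    static final String BOTH_WAYS = "\\s|(?<=\\w)(?=\\W)|(?<=[^\\w\\s])(?=\\w)";

    // Thin wrapper around String.split so both patterns can be compared.
    static String[] split(String input, String regex) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        String input = "Hello,world"; // hypothetical input, no space after the comma
        System.out.println(Arrays.toString(split(input, WORD_TO_PUNCT))); // [Hello, ,world]
        System.out.println(Arrays.toString(split(input, BOTH_WAYS)));     // [Hello, ,, world]
    }
}
```

With the first pattern the comma stays glued to "world", because nothing matches between a punctuation character and the word after it.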

You can do that with this pattern:

\\s+|(?<=\\S)(?=[^\\w\\s])|(?<=[^\\w\\s])\\b

It collapses runs of whitespace and deals with consecutive special characters. For example:

With ;This, is Stack; ;; Overflow.

you obtain: [";", "This", ",", "is", "Stack", ";", ";", ";", "Overflow", "."]
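As a rough sketch of how this pattern might be used with split(), the following runs it on a variant of the example input with extra spaces added. The class and method names are hypothetical and not part of the answer:

```java
import java.util.Arrays;

class PunctuationSplit {
    // Pattern from the answer, written as a Java string literal.
    static final String PATTERN = "\\s+|(?<=\\S)(?=[^\\w\\s])|(?<=[^\\w\\s])\\b";

    static String[] tokens(String input) {
        return input.split(PATTERN);
    }

    public static void main(String[] args) {
        // Same example as above, with some runs of several spaces.
        String input = ";This,  is Stack; ;;   Overflow.";
        System.out.println(Arrays.toString(tokens(input)));
        // [;, This, ,, is, Stack, ;, ;, ;, Overflow, .]
    }
}
```

Because the first alternative is \\s+ rather than \\s, a run of several spaces is consumed as a single delimiter and produces no empty tokens in between.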

But obviously, the more efficient way is to use the find method rather than the split method, with this pattern:

\\w+|[^\\w\\s]
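A minimal sketch of the find-based approach, assuming the tokens are collected into a list. The class and method names are illustrative only:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindTokens {
    // A token is either a run of word characters or one non-word, non-space character.
    private static final Pattern TOKEN = Pattern.compile("\\w+|[^\\w\\s]");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {       // advance to the next token
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("This, is Stack; Overflow."));
        // [This, ,, is, Stack, ;, Overflow, .]
    }
}
```

Here the pattern describes the tokens themselves instead of the gaps between them, so whitespace is simply skipped and no look-around is needed. If an array is required, tokens.toArray(new String[0]) converts the list.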
