Java regex to split along words, punctuation, and whitespace, and keep all in an array
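I am trying to split a sentence into an array of strings. I want to keep all of the words, punctuation, and whitespace in a single array.
For example:
"Hello! My name is John Doe."
would be split into: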
["Hello", "!", " ", "My", " ", "name", " ", "is", " ", "John", " ", "Doe"]
I currently have the following line of code to break up my sentence:
String[] fragments = sentence.split("(?<!^)\\b");
However, this runs into a problem: a punctuation mark and the space that follows it are returned as a single string. How can I modify my regex to fix this?
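For reference, the behaviour described above can be reproduced with a small self-contained program. This is only a sketch: the class name `BoundarySplitDemo` and the helper `splitOnBoundaries` are made up for illustration, and only the regex is taken from the question.

```java
public class BoundarySplitDemo {
    // Hypothetical helper wrapping the pattern from the question:
    // split at every word boundary except the one at the start of the input.
    static String[] splitOnBoundaries(String sentence) {
        return sentence.split("(?<!^)\\b");
    }

    public static void main(String[] args) {
        String[] fragments = splitOnBoundaries("Hello! My name is John Doe.");
        // There is no word boundary between '!' and ' ' (both are non-word
        // characters), so they come back together as the single element "! ".
        for (String fragment : fragments) {
            System.out.println("[" + fragment + "]");
        }
    }
}
```

The second line printed is `[! ]`, which is the merged punctuation-plus-space element the question is about.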
You can try the following regex:
(?<=\b|[^\p{L}])
"Hello! My name is John Doe.".split("(?<=\\b|[^\\p{L}])", 0)
// ⇒ ["Hello", "!", " ", "My", " ", "name", " ", "is", " ", "John", " ", "Doe", "."]
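To try this outside of a one-liner, the regex can be wrapped in a small program. This is a minimal sketch, assuming Java 8 or later (where a zero-width match at the start of the input does not produce a leading empty string); the class name `TokenSplitDemo` and the method `tokenize` are hypothetical names, not part of the answer.

```java
import java.util.regex.Pattern;

public class TokenSplitDemo {
    // Regex from the answer: a split point is any word boundary,
    // or any position directly after a character that is not a letter.
    private static final Pattern SPLIT_POINT = Pattern.compile("(?<=\\b|[^\\p{L}])");

    // Hypothetical helper; Pattern.split drops trailing empty strings,
    // just like String.split with a limit of 0.
    static String[] tokenize(String sentence) {
        return SPLIT_POINT.split(sentence);
    }

    public static void main(String[] args) {
        for (String token : tokenize("Hello! My name is John Doe.")) {
            System.out.println("[" + token + "]");
        }
    }
}
```

One caveat worth checking against your own input: `[^\p{L}]` treats digits as non-letters, so a split also happens after every digit, and `tokenize("42")` yields `"4"` and `"2"` rather than a single token.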