Regular expression for words containing hyphens
I can use string.split("\\W+") to get words that contain only word characters.
However:
I don't want to break words such as "re-use" into "re" and "use".
The same goes for words with multiple hyphens, like "out-of-the-way".
I do want to break "and--oh" into "and" and "oh".
How can I achieve that?
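To make the problem concrete, here is a minimal sketch of what the plain "\\W+" split does to the words mentioned above. The class and method names (NaiveSplitDemo, naiveSplit) are made up for illustration only.

```java
import java.util.Arrays;

public class NaiveSplitDemo {
    // Splits on every run of non-word characters, so hyphens act as delimiters too.
    public static String[] naiveSplit(String text) {
        return text.split("\\W+");
    }

    public static void main(String[] args) {
        // Prints: [re, use, out, of, the, way, and, oh]
        System.out.println(Arrays.toString(naiveSplit("re-use out-of-the-way and--oh")));
    }
}
```

Every hyphen is treated as a separator, which is why the hyphenated words fall apart.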
Try this regular expression:
string.split("[^\\w\\-]+|--+")
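As a rough check of that pattern, the sketch below runs it against the words from the question. The class name, method name, and sample sentence are assumptions made for the demo, not part of the original answer.

```java
import java.util.Arrays;

public class HyphenSplitDemo {
    // Delimiters are either runs of characters that are neither word characters
    // nor hyphens, or runs of two or more hyphens. A single hyphen is kept.
    public static String[] splitWords(String text) {
        return text.split("[^\\w\\-]+|--+");
    }

    public static void main(String[] args) {
        String text = "We re-use out-of-the-way parts, and--oh well";
        // Prints: [We, re-use, out-of-the-way, parts, and, oh, well]
        System.out.println(Arrays.toString(splitWords(text)));
    }
}
```

Note that \w also matches digits and underscores, so tokens such as "a_b" or "x-1" are kept whole as well.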
You can first replace each run of consecutive hyphens with a special character (one that the split pattern treats as a delimiter), and then do a simple regex split.
Please refer to the code below.
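public class Test {
    public static void main(String[] args) {
        String str = "This is^^some@@words-apple-banana--orange";
        // Replace every run of two or more hyphens with a single "@" delimiter.
        str = str.replaceAll("[-]{2,}", "@");
        System.out.println(str);
        // Split on runs of characters that are neither word characters nor hyphens.
        String regex = "[^\\w-]+";
        String[] arr = str.split(regex);
        for (String item : arr) {
            System.out.println(item);
        }
    }
}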
The result is:
This is^^some@@words-apple-banana@orange
This
is
some
words-apple-banana
orange
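One detail worth checking with either approach: String.split keeps an empty leading token when the input starts with a delimiter. The sketch below wraps the replace-then-split idea in a hypothetical helper (ReplaceThenSplitDemo.words is an invented name) that skips empty tokens. It is one possible way to handle that case, not the only one.

```java
import java.util.ArrayList;
import java.util.List;

public class ReplaceThenSplitDemo {
    // Hypothetical helper: collapse hyphen runs into "@", split, and drop empty tokens.
    public static List<String> words(String text) {
        String collapsed = text.replaceAll("-{2,}", "@");
        List<String> result = new ArrayList<>();
        for (String token : collapsed.split("[^\\w-]+")) {
            if (!token.isEmpty()) {
                result.add(token);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The input starts with "--", which would otherwise yield an empty first token.
        // Prints: [This, is, some, words-apple-banana, orange]
        System.out.println(words("--This is^^some@@words-apple-banana--orange"));
    }
}
```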