Regex + Java: split a text into words and remove punctuation only if it stands alone or at the end of a word
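I am trying to split a string into words, but I want to keep "a.b.c" as a single word and remove punctuation only when it stands alone or appears at the end of a word, e.g.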
"a.b.c" --> "a.b.c"
"a.b." --> "a.b"
For example,
String str1 = "abc a.b a. . b, , test"; should return "abc", "a.b", "a", "b", "test"
You can use:
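String str1 = "abc a.b a. . b, , test";
// Delimiter: optional punctuation, at least one whitespace character,
// then any further run of whitespace and punctuation.
String[] toks = str1.split("\\p{Punct}*\\s+[\\s\\p{Punct}]*");
for (String tok : toks)
    System.out.printf(">>> [%s]%n", tok);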
>>> [abc]
>>> [a.b]
>>> [a]
>>> [b]
>>> [test]
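Note that the delimiter above always requires at least one whitespace character, so punctuation at the very end of the input (as in "a.b.") is not stripped, and input that begins with punctuation followed by a space yields an empty first token. If those cases matter, one alternative is to match the words instead of splitting on the delimiters. The following is only a sketch of that idea, not part of the original answer: the class name WordSplitter and the method words are made up for illustration, and it assumes a "word" must start and end with a character that is neither whitespace nor ASCII punctuation (\p{Punct}).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts words by matching them rather than splitting.
public class WordSplitter {
    // A word is a run of non-space, non-punctuation characters, optionally
    // followed by groups of "punctuation + more such characters".
    // Punctuation is therefore kept only when it sits inside a word.
    private static final Pattern WORD =
        Pattern.compile("[^\\s\\p{Punct}]+(?:\\p{Punct}+[^\\s\\p{Punct}]+)*");

    public static List<String> words(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(words("abc a.b a. . b, , test")); // [abc, a.b, a, b, test]
        System.out.println(words("a.b."));                   // [a.b]
        System.out.println(words(". a.b.c"));                // [a.b.c]
    }
}
```

Unlike the split-based version, this also strips punctuation at the start of a word, so "(abc)" becomes "abc"; whether that is desirable depends on the input you expect.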