Specific Regex Pattern
I wish to take a string input from the user and extract words or numbers like so:
String problem = "I'm lo#o@king t%o ext!r$act a^ll 6 su*bs(tr]i{ngs.";
String[] solve = {"I'm", "looking", "to", "extract", "all", "6", "substrings"};
Basically, I want to extract numbers and words, disregarding all punctuation except apostrophes. I know how to get words and strings, but I can't figure out this tricky part.
You could do it as shown below:
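// Requires: import java.util.Arrays;
String s = "I'm lo#o@king t%o ext!r$act a^ll 6 su*bs(tr]i{ngs.";
// Strip punctuation (keeping apostrophes inside words), then split on whitespace.
String[] parts = s.replaceAll("[^\\s\\w']|(?<!\\b)'|'(?!\\b)", "").split("\\s+");
System.out.println(Arrays.toString(parts));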
Output:
[I'm, looking, to, extract, all, 6, substrings]
Explanation:
[^\\s\\w'] matches any character that is not whitespace, a word character (letter, digit, or underscore), or an apostrophe.
(?<!\\b)'|'(?!\\b) matches an apostrophe that is not preceded by a word character, or not followed by one. In other words, an apostrophe survives only when it sits between two word characters, as in I'm.
replaceAll replaces every match with an empty string, which deletes those characters.
Finally, we split the resulting string on runs of one or more whitespace characters.
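The steps above can be wrapped into a small self-contained program. This is only a sketch: the class name WordExtractor and the method extract are hypothetical names chosen for illustration, and the trim() call plus the empty-string check are added assumptions, since split("\\s+") would otherwise return an empty first element for input that starts with whitespace.

```java
import java.util.Arrays;

public class WordExtractor {
    // Hypothetical helper: applies the regex from the answer, then splits on whitespace.
    static String[] extract(String input) {
        String cleaned = input
                // Delete punctuation, and any apostrophe not sitting between two word characters.
                .replaceAll("[^\\s\\w']|(?<!\\b)'|'(?!\\b)", "")
                // Assumption: leading/trailing whitespace should not produce empty tokens.
                .trim();
        if (cleaned.isEmpty()) {
            // "".split(...) would return [""], so return an empty array instead.
            return new String[0];
        }
        return cleaned.split("\\s+");
    }

    public static void main(String[] args) {
        String s = "I'm lo#o@king t%o ext!r$act a^ll 6 su*bs(tr]i{ngs.";
        System.out.println(Arrays.toString(extract(s)));
        // prints [I'm, looking, to, extract, all, 6, substrings]

        // Apostrophes at the edge of a word are dropped, as are stray quotes.
        System.out.println(Arrays.toString(extract("  'quoted' dogs' toys!  ")));
        // prints [quoted, dogs, toys]
    }
}
```

Note that \w also matches the underscore, so a token such as snake_case would be kept intact by this pattern.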