
Specific Regex Pattern

I want to take a string input from the user and extract words and numbers like so:

String problem = "I'm lo#o@king t%o ext!r$act a^ll 6 su*bs(tr]i{ngs.";

String[] solve = {"I'm", "looking", "to", "extract", "all", "6", "substrings"};

Basically, I want to extract numbers and words while ignoring all punctuation except apostrophes. I know how to get words and numbers on their own, but I can't figure out how to handle this part.

You could do it like this:

import java.util.Arrays;

String s = "I'm lo#o@king t%o ext!r$act a^ll 6 su*bs(tr]i{ngs.";
// Strip the unwanted characters, then split on runs of whitespace.
String[] parts = s.replaceAll("[^\\s\\w']|(?<!\\b)'|'(?!\\b)", "").split("\\s+");
System.out.println(Arrays.toString(parts));

Output:

[I'm, looking, to, extract, all, 6, substrings]

Explanation:

  • [^\\s\\w'] matches any character that is not whitespace, a word character, or an apostrophe.

  • (?<!\\b)'|'(?!\\b) matches an apostrophe that is not preceded by a word character, or not followed by one. Only an apostrophe with a word character on both sides (as in I'm) survives.

  • replaceAll replaces every matched character with an empty string.

  • Finally, the resulting string is split on one or more whitespace characters.
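The steps above can be put together as a small runnable class. This is only a sketch: the class name ExtractWords and the helper extract are made up for illustration and are not part of the original answer. It also adds a trim() and an empty-string check, on the assumption that input may start with whitespace or consist only of punctuation, in which case a bare split would return an empty token.

```java
import java.util.Arrays;

public class ExtractWords {
    // Matches every character that is not whitespace, a word character, or an
    // apostrophe, plus any apostrophe lacking a word character on either side.
    private static final String STRIP = "[^\\s\\w']|(?<!\\b)'|'(?!\\b)";

    // Hypothetical helper (name chosen for this sketch).
    static String[] extract(String input) {
        // trim() is an addition: it avoids an empty first token when the
        // input begins with whitespace.
        String cleaned = input.replaceAll(STRIP, "").trim();
        // Splitting an empty string would yield one empty token, so return early.
        if (cleaned.isEmpty()) {
            return new String[0];
        }
        return cleaned.split("\\s+");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(
                extract("I'm lo#o@king t%o ext!r$act a^ll 6 su*bs(tr]i{ngs.")));
        // Quote-style apostrophes are dropped; the one inside "don't" is kept.
        System.out.println(Arrays.toString(
                extract("  'quoted' words don't break  ")));
    }
}
```

Note that \w also matches the underscore, so a token such as foo_bar passes through unchanged. If that is not wanted, the character class would need adjusting.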
