Match all numeric characters without letters or accented letters before or after
I tried this:
\b\d+\b
but for this string:
0225 : appt, (parking) niv -2 0015_1 5étage sqdqs25485 7871sdd
I want to find:
0225 2 0015 1
(?<![\p{M}\p{L}\d])\d+(?![\p{M}\p{L}\d])
You can achieve it this way. See the demo:
https://regex101.com/r/fM9lY3/24
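To check the pattern outside regex101, here is a minimal Java sketch that runs it against the string from the question. The class name `StandaloneNumbers` and the helper `findStandaloneNumbers` are made up for illustration, and the é is written as `\u00e9` only to sidestep source-encoding issues.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StandaloneNumbers {
    // Digit runs that are not adjacent to a letter (\p{L}),
    // a combining mark (\p{M}), or another digit.
    private static final Pattern STANDALONE =
            Pattern.compile("(?<![\\p{M}\\p{L}\\d])\\d+(?![\\p{M}\\p{L}\\d])");

    // Hypothetical helper: collects every match into a list.
    static List<String> findStandaloneNumbers(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = STANDALONE.matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "0225 : appt, (parking) niv -2 0015_1 5\u00e9tage sqdqs25485 7871sdd";
        System.out.println(findStandaloneNumbers(s)); // prints [0225, 2, 0015, 1]
    }
}
```

The underscore is neither a letter, a mark, nor a digit, which is why `0015_1` yields both `0015` and `1`.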
Try with:
(?<![\p{L}\d])(\d+)(?![\p{L}\d])
where:
(?<![\p{L}\d]) - negative lookbehind: the preceding code point must not be in the category "letter" or be a digit,
(\d+) - one or more digits,
(?![\p{L}\d]) - negative lookahead: the following code point must not be in the category "letter" or be a digit.
You can use the following code to obtain the required numbers:
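import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "0225 : appt, (parking) niv -2 0015_1 5étage";
// UNICODE_CHARACTER_CLASS makes \b treat accented letters such as é as word characters.
Pattern pattern = Pattern.compile("(?<=_|\\b)\\d+(?=\\b|_)", Pattern.UNICODE_CHARACTER_CLASS);
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    System.out.println(matcher.group(0));
}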
See the IDEONE demo.
The regex matches one or more digits ( \\d+ ) only if they are preceded by an underscore or a word boundary ( (?<=_|\\b) ) and followed by a word boundary or an underscore ( (?=\\b|_) ).
Use the (?U) flag (or Pattern.UNICODE_CHARACTER_CLASS), since \b without the (?U) flag does not handle Unicode word characters consistently.
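The inline (?U) form can be sketched as follows, using the same pattern as the code above but without passing the flag to Pattern.compile. The class name `InlineUnicodeFlag` and the method `findNumbers` are hypothetical, and the é is written as `\u00e9` to avoid source-encoding issues.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InlineUnicodeFlag {
    // (?U) at the start of the pattern enables UNICODE_CHARACTER_CLASS inline,
    // so \b treats accented letters such as é as word characters.
    private static final Pattern NUMBERS =
            Pattern.compile("(?U)(?<=_|\\b)\\d+(?=\\b|_)");

    // Hypothetical helper: collects every match into a list.
    static List<String> findNumbers(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = NUMBERS.matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "0225 : appt, (parking) niv -2 0015_1 5\u00e9tage sqdqs25485 7871sdd";
        System.out.println(findNumbers(s)); // prints [0225, 2, 0015, 1]
    }
}
```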