Match all numeric characters without letters or accented letters before or after

I tried this:

\b\d+\b

but for this string:

0225 : appt, (parking) niv -2 0015_1 5étage sqdqs25485 7871sdd

I want to find:

0225 2 0015 1

(?<![\p{M}\p{L}\d])\d+(?![\p{M}\p{L}\d])

You can achieve it this way. See the demo:

https://regex101.com/r/fM9lY3/24
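
As a rough illustration of the pattern above, here is a minimal Java sketch run against the sample string from the question. The class name `StandaloneDigits` and the helper `find` are hypothetical names chosen for this example, and the `é` is written as `\u00e9` on the assumption that the input uses the precomposed character.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class StandaloneDigits {
    // Digit runs that are neither preceded nor followed by a letter,
    // a combining mark, or another digit.
    static final Pattern STANDALONE =
            Pattern.compile("(?<![\\p{M}\\p{L}\\d])\\d+(?![\\p{M}\\p{L}\\d])");

    // Collects every match of STANDALONE in the input, in order.
    static List<String> find(String input) {
        List<String> runs = new ArrayList<>();
        Matcher m = STANDALONE.matcher(input);
        while (m.find()) {
            runs.add(m.group());
        }
        return runs;
    }

    public static void main(String[] args) {
        String s = "0225 : appt, (parking) niv -2 0015_1 5\u00e9tage sqdqs25485 7871sdd";
        System.out.println(find(s)); // prints [0225, 2, 0015, 1]
    }
}
```

The `\d` inside both lookarounds is what stops the engine from backtracking into a partial run such as `787` in `7871sdd`, and `\p{M}` covers digits that sit next to a combining accent when the text is in decomposed form.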

Try with:

(?<![\p{L}\d])(\d+)(?![\p{L}\d])

where:

  • (?<![\p{L}\d]) - negative lookbehind: the match must not be preceded by a code point in the category "letter" or by a digit,
  • (\d+) - one or more digits, captured in group 1,
  • (?![\p{L}\d]) - negative lookahead: the match must not be followed by a code point in the category "letter" or by a digit.
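
To see why the lookarounds need `\d` alongside `\p{L}`, the following Java sketch compares the pattern with a letters-only variant on the question's sample string. The class `LookaroundDigits`, its two constants, and the helper `captures` are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical comparison class, not part of the original answer.
class LookaroundDigits {
    // Lookarounds that reject neighbouring letters only.
    static final Pattern LETTERS_ONLY =
            Pattern.compile("(?<![\\p{L}])(\\d+)(?![\\p{L}])");
    // Lookarounds that reject neighbouring letters and digits.
    static final Pattern LETTERS_AND_DIGITS =
            Pattern.compile("(?<![\\p{L}\\d])(\\d+)(?![\\p{L}\\d])");

    // Returns the content of capturing group 1 for every match.
    static List<String> captures(Pattern p, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "0225 : appt, (parking) niv -2 0015_1 5\u00e9tage sqdqs25485 7871sdd";
        // Partial runs leak through when digits are allowed as neighbours.
        System.out.println(captures(LETTERS_ONLY, s));       // prints [0225, 2, 0015, 1, 5485, 787]
        System.out.println(captures(LETTERS_AND_DIGITS, s)); // prints [0225, 2, 0015, 1]
    }
}
```

With letters only, the engine can start a match in the middle of `25485` or give back the last digit of `7871`, because the neighbouring character is then a digit rather than a letter.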

You can use the following code to obtain the required numbers:

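import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "0225 : appt, (parking) niv -2 0015_1 5étage";
// UNICODE_CHARACTER_CLASS makes \b, \w and \d follow their Unicode definitions.
Pattern pattern = Pattern.compile("(?<=_|\\b)\\d+(?=\\b|_)", Pattern.UNICODE_CHARACTER_CLASS);
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    System.out.println(matcher.group(0)); // prints 0225, 2, 0015, 1 (one per line)
}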

See IDEONE demo

The regex matches one or more digits ( \d+ ) only if they are preceded by an underscore or a word boundary ( (?<=_|\b) ) and followed by a word boundary or an underscore ( (?=\b|_) ). The underscore needs its own alternative because it counts as a word character, so \b alone does not separate 0015 from 1 in 0015_1.

Use the (?U) flag (or Pattern.UNICODE_CHARACTER_CLASS ) so that \b follows the Unicode definition of a word character; without it, \b may not treat accented letters such as é consistently.
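
The two ways of enabling the flag can be sketched side by side as follows. The class `UnicodeFlagDemo` and its members are hypothetical names for this example; the sketch only assumes that the inline `(?U)` prefix and the `Pattern.UNICODE_CHARACTER_CLASS` constant select the same behaviour, as the Java `Pattern` documentation describes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class UnicodeFlagDemo {
    // Flag passed as a compile-time constant.
    static final Pattern VIA_CONSTANT =
            Pattern.compile("(?<=_|\\b)\\d+(?=\\b|_)", Pattern.UNICODE_CHARACTER_CLASS);
    // Same flag enabled inline at the start of the expression.
    static final Pattern VIA_INLINE_FLAG =
            Pattern.compile("(?U)(?<=_|\\b)\\d+(?=\\b|_)");

    // Collects every match of the given pattern in the input.
    static List<String> find(Pattern p, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "0225 : appt, (parking) niv -2 0015_1 5\u00e9tage";
        System.out.println(find(VIA_CONSTANT, s));    // prints [0225, 2, 0015, 1]
        System.out.println(find(VIA_INLINE_FLAG, s)); // prints [0225, 2, 0015, 1]
    }
}
```

The inline form is convenient when the pattern is stored as a plain string, for example in a configuration file, and the compile call cannot be changed.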
