Regex: Match wildcard followed by variable length of digits
I'm trying to extract the personal number from a string like
Personal number: 123456
with the following regex:
(Personal number|Personalnummer).*(\d{2,10})
When I read the second group, it contains only the last 2 digits of the personal number. If I change the digit range to {3,10}, it contains the last 3 digits of the personal number.
I cannot just add the whitespace as an additional group, because I cannot be sure that there will always be whitespace. There might be none, or there might be some other characters, but the personal number will always be at the end.
Is there any way I could instruct the parser to get the whole digit string?
.* is a greedy quantifier. It consumes as many characters as it can and gives back only what the rest of the pattern needs, so it eats everything except the last 2 digits, which are the minimum that \d{2,10} requires in order to match.
You have to make it reluctant by appending ? to it, like below:
(Personal number|Personalnummer).*?(\d{2,10})
Now the second group should contain the whole number.
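To make the difference concrete, here is a small sketch that runs both patterns against the sample string from the question. The class name GreedyVsReluctant and the helper secondGroup are made up for this illustration; only java.util.regex is used.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsReluctant {
    // Hypothetical helper: returns group 2 of the first match, or null if nothing matches.
    static String secondGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String input = "Personal number: 123456";

        // Greedy: .* takes as much as it can and backtracks only until
        // \d{2,10} can match its minimum of two digits.
        System.out.println(secondGroup("(Personal number|Personalnummer).*(\\d{2,10})", input));  // 56

        // Reluctant: .*? takes as little as it can, so \d{2,10} gets all six digits.
        System.out.println(secondGroup("(Personal number|Personalnummer).*?(\\d{2,10})", input)); // 123456
    }
}
```

Note that {2,10} still caps the capture at 10 digits, so a longer number would be cut off after its first 10 digits.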
You can also convert the first group into a non-capturing group. Then the number you want is the only captured group, like below:
(?:Personal number|Personalnummer).*?(\d{2,10})
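As a rough sketch of how the non-capturing variant might be used from Java: the number is now group 1 instead of group 2. The class name NonCapturingDemo and the method extractNumber are invented for this example, and the second input (no separator at all) is an assumed case based on the question's remark that there may be no whitespace.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonCapturingDemo {
    // (?:...) groups the alternatives without capturing, so the digits are group 1.
    private static final Pattern PERSONAL_NUMBER =
            Pattern.compile("(?:Personal number|Personalnummer).*?(\\d{2,10})");

    // Hypothetical helper: returns the digits, or null if the input does not match.
    static String extractNumber(String input) {
        Matcher m = PERSONAL_NUMBER.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractNumber("Personal number: 123456")); // 123456
        System.out.println(extractNumber("Personalnummer123456"));    // 123456
    }
}
```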
Use a reluctant quantifier on the wildcard match (e.g. *?). For instance, .*? will result in the full numeric expression:
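import java.util.regex.Matcher;
import java.util.regex.Pattern;

// note the ? after .* which makes the wildcard reluctant
Pattern p = Pattern.compile("(Personal number|Personalnummer).*?(\\d{2,10})");
Matcher m = p.matcher("Personal number: 123456");
if (m.find()) {
    System.out.println(m.group(2)); // prints 123456
}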