
Regex: Match wildcard followed by variable length of digits

I'm trying to extract the personal number from a string like "Personal number: 123456" with the following regex:

(Personal number|Personalnummer).*(\d{2,10})

When I read the second group, it contains only the last 2 digits of the personal number. If I change the digit range to {3,10}, it contains the last 3 digits of the personal number.

I cannot simply add the whitespace as an additional group, because I cannot be sure that there will always be whitespace: there might be none, or there might be some other characters. The personal number will always be at the end, though.

Is there any way I could instruct the parser to capture the whole digit string?

.* is a greedy quantifier. It consumes as many characters as it can and then backtracks only as far as it must, so it gives back just the last 2 digits, the minimum that \d{2,10} needs in order to match.
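To see the greedy behaviour in isolation, here is a minimal sketch that runs the question's original pattern with both digit ranges. The class name GreedyCapture and the helper secondGroup are made up for this illustration and are not part of the original post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyCapture {
    // Hypothetical helper: returns the text captured by group 2,
    // or null when the pattern does not match at all.
    static String secondGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String input = "Personal number: 123456";

        // Greedy .* first runs to the end of the input, then gives back
        // one character at a time until \d{2,10} can match: two digits.
        System.out.println(
            secondGroup("(Personal number|Personalnummer).*(\\d{2,10})", input)); // 56

        // Raising the minimum to 3 forces .* to give back one more digit.
        System.out.println(
            secondGroup("(Personal number|Personalnummer).*(\\d{3,10})", input)); // 456
    }
}
```

The captured length always equals the lower bound of the digit range, which is exactly the symptom described in the question.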

You have to make it reluctant by appending ? to the quantifier, like below:

(Personal number|Personalnummer).*?(\d{2,10})

Now the second group should contain the whole number.

You can also convert the first group into a non-capturing group. Then the number you want is the only captured group (group 1), like below:

(?:Personal number|Personalnummer).*?(\d{2,10})
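As a rough sketch of how the non-capturing variant could be used from Java: the class name PersonalNumberExtractor, the method extract, and the sample inputs other than "Personal number: 123456" are assumptions made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PersonalNumberExtractor {
    // (?:...) groups the two labels without capturing them,
    // so the digits are the only capture and land in group 1.
    private static final Pattern PERSONAL_NUMBER =
            Pattern.compile("(?:Personal number|Personalnummer).*?(\\d{2,10})");

    // Hypothetical helper: returns the digits after the label,
    // or null if the input contains no labelled number.
    static String extract(String input) {
        Matcher m = PERSONAL_NUMBER.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("Personal number: 123456")); // 123456
        System.out.println(extract("Personalnummer123456"));    // 123456 (no separator at all)
        System.out.println(extract("no label here"));           // null
    }
}
```

Note that \d{2,10} still caps the capture at 10 digits, so a longer number would be cut off after its first 10 digits.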

Use a reluctant quantifier on the wildcard match (e.g. *?). For instance, .*? will make the second group capture the full number:

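import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Note the ? after .* : the reluctant wildcard stops as soon as \d{2,10} can match.
Pattern p = Pattern.compile("(Personal number|Personalnummer).*?(\\d{2,10})");
Matcher m = p.matcher("Personal number:    123456");
if (m.find()) {
    System.out.println(m.group(2)); // prints 123456
}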


 