
Regular expression works with java.util.regex.Pattern but not com.oroinc.text.regex.Perl5Matcher

I came across a bug today in our legacy code, which uses Perl5Compiler and Perl5Matcher with the following regular expression to validate UK postcodes:

((?i)(([A-Z]{2}[0-9]{1,2})|([A-Z]{1,2}[0-9][A-Z])|([A-Z][0-9]{1,2}))\\s([0-9][A-Z]{2})|(BFPO\\s\\d{1,4})|(GIR\\s0AA))

However, it failed to validate correctly for postcodes such as 'G12 4NNT' (the last section is only allowed to be a number followed by 2 letters in this case). I fixed this by switching to the java.util.regex.Pattern class, which handles the above regular expression correctly and passes all of my unit tests.

However, now I'm curious why it didn't work with the Perl5 classes. Is there a fundamental difference in the regular expression syntax used by the two APIs?

I think the problem is the same as in the question for the answer linked above.

If you use the matches() method in Java:

text.matches("((?i)(([A-Z]{2}[0-9]{1,2})|([A-Z]{1,2}[0-9][A-Z])|([A-Z][0-9]{1,2}))\\s([0-9][A-Z]{2})|(BFPO\\s\\d{1,4})|(GIR\\s0AA))");

it matches against the complete string. To get the same behaviour in Perl, you have to put anchors around your expression:

^((?i)(([A-Z]{2}[0-9]{1,2})|([A-Z]{1,2}[0-9][A-Z])|([A-Z][0-9]{1,2}))\\s([0-9][A-Z]{2})|(BFPO\\s\\d{1,4})|(GIR\\s0AA))$

^ matches the start of the string

$ matches the end of the string
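
The difference can be reproduced with java.util.regex alone. This is only a minimal sketch: the class and method names (PostcodeAnchorDemo, matchesWhole, findsUnanchored, findsAnchored) are made up for illustration, and find() is used here as a stand-in for a search-style match, on the assumption that the legacy code called the Perl5 matcher in a way that accepts a match anywhere in the input.

```java
import java.util.regex.Pattern;

public class PostcodeAnchorDemo {
    // The postcode expression from the question, as a Java string literal.
    static final String POSTCODE =
        "((?i)(([A-Z]{2}[0-9]{1,2})|([A-Z]{1,2}[0-9][A-Z])|([A-Z][0-9]{1,2}))"
        + "\\s([0-9][A-Z]{2})|(BFPO\\s\\d{1,4})|(GIR\\s0AA))";

    // matches() succeeds only if the whole input is consumed by the pattern.
    static boolean matchesWhole(String text) {
        return Pattern.compile(POSTCODE).matcher(text).matches();
    }

    // find() succeeds if any substring of the input matches the pattern.
    static boolean findsUnanchored(String text) {
        return Pattern.compile(POSTCODE).matcher(text).find();
    }

    // With ^ and $ added, a search-style match is forced to span the whole input.
    static boolean findsAnchored(String text) {
        return Pattern.compile("^" + POSTCODE + "$").matcher(text).find();
    }

    public static void main(String[] args) {
        for (String text : new String[] {"G12 4NN", "G12 4NNT"}) {
            System.out.println(text
                + " matches()=" + matchesWhole(text)
                + " find()=" + findsUnanchored(text)
                + " anchored find()=" + findsAnchored(text));
        }
    }
}
```

For 'G12 4NNT', the unanchored find() accepts the input because the substring 'G12 4NN' matches, while matches() and the anchored find() both reject it.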
