
Validate UK postcodes using a regular expression in Oracle

Below is the list of valid postcodes:

A1 1AA
A11 1AA
AA1 1AA
AA11 1AA
A1A 1AA
BFPO 1
BFPO 11
BFPO 111

I tried (([A-Z]{1,2}[0-9]{1,2})\ ([0-9][A-Z]{2}))|(GIR\ 0AA)$ but it is not working. Could you please help me with a proper query that validates all of these postcode formats?

First, rather than guessing based on the set of data at hand, let's look at what UK postcodes are.

EC1V 9HQ

The first one or two letters is the postcode area and it identifies the main Royal Mail sorting office which will process the mail. In this case EC would go to the Mount Pleasant sorting office in London.

The second part is usually just one or two numbers but for some parts of London it can be a number and a letter. This is the postcode district and tells the sorting office which delivery office the mail should go to.

This third part is the sector and is usually just one number. This tells the delivery office which local area or neighbourhood the mail should go to.

The final part of the postcode is the unit code which is always two letters. This identifies a group of up to 80 addresses and tells the delivery office which postal route (or walk) will deliver the item.

Digesting that...

  1. 1 or 2 letters.
  2. A number and maybe an alphanumeric.
  3. A space.
  4. "Usually" a number, but I can't find any instances otherwise.
  5. 2 letters.

\A[[:alpha:]]{1,2}\d[[:alnum:]]? \d[[:alpha:]]{2}\z
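
As a quick sanity check outside the database, the same pattern can be tried against the postcodes from the question. This is only a sketch in Python, not Oracle: Python's re module has no POSIX classes, so [A-Za-z] and [A-Za-z0-9] stand in for [[:alpha:]] and [[:alnum:]], and \Z (Python's end-of-string anchor) stands in for \z. The sample list is just the question's data plus the example postcode.

```python
import re

# Python translation of the pattern above (an assumption for illustration):
# [A-Za-z] replaces [[:alpha:]], [A-Za-z0-9] replaces [[:alnum:]],
# and \Z replaces \z as the end-of-string anchor.
BASIC = re.compile(r"\A[A-Za-z]{1,2}\d[A-Za-z0-9]? \d[A-Za-z]{2}\Z")

samples = [
    "A1 1AA", "A11 1AA", "AA1 1AA", "AA11 1AA", "A1A 1AA",
    "EC1V 9HQ", "BFPO 1", "GIR 0AA",
]

# Map each sample to True (matches the basic pattern) or False.
results = {pc: BASIC.match(pc) is not None for pc in samples}

for pc, ok in results.items():
    print(f"{pc:10} {'valid' if ok else 'invalid'}")
```

The ordinary formats all pass. GIR 0AA and BFPO 1 do not, because their third character is a letter where the pattern expects a digit; the BFPO forms from the question would need an alternative of their own.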

We can't use \w because it also matches the underscore.

I used the more exact \A and \z over ^ and $ because \A and \z match only at the very beginning and end of the string, whereas ^ and $ can match at the beginning and end of a line (in Oracle, when the 'm' match parameter is set). In Perl-style engines, $ in particular also matches just before a trailing newline.
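
Both points are easy to see in a Perl-style engine. The sketch below uses Python's re module, so it shows Python's behaviour rather than Oracle's; it is worth running the equivalent check on the target database. As before, [A-Za-z] and \Z are Python stand-ins for [[:alpha:]] and \z, and the variable names are made up for the example.

```python
import re

# \w matches letters, digits AND the underscore.
w_accepts_underscore = re.fullmatch(r"\w", "_") is not None
alnum_accepts_underscore = re.fullmatch(r"[A-Za-z0-9]", "_") is not None

# A value with a trailing newline, as might arrive from a file or a form.
value = "EC1V 9HQ\n"
pattern_body = r"[A-Za-z]{1,2}\d[A-Za-z0-9]? \d[A-Za-z]{2}"

# ^...$ : "$" also matches just before a final newline, so this passes.
dollar_matches = re.search("^" + pattern_body + "$", value) is not None

# \A...\Z : Python's \Z matches only at the very end of the string,
# so the trailing newline makes the match fail.
strict_matches = re.search(r"\A" + pattern_body + r"\Z", value) is not None

print(w_accepts_underscore, alnum_accepts_underscore)
print(dollar_matches, strict_matches)
```

Each line prints True False: \w lets the underscore in while the explicit class does not, and the ^...$ version accepts the newline-terminated value while the \A...\Z version rejects it.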


Of course, there are special cases. Various overseas territories use XXXX 1ZZ, where XXXX is one of a fixed list of codes.

\A(ASCN|STHL|TDCU|BBND|BIQQ|FIQQ|PCRN|SIQQ|TKCA) 1ZZ\z

Then a couple of really special cases.

  • GIR 0AA for Girobank.
  • AI-2640 for Anguilla.

\A(AI-2640|GIR 0AA)\z

Put them all together into one big (...|...|...) mess. It's good to build the pattern in three pieces and put them together with the x modifier, which ignores whitespace in the pattern.

REGEXP_LIKE(
    postcode,
    '\A
     (
      [[:alpha:]]{1,2}\d[[:alnum:]]?\ \d[[:alpha:]]{2}     |
      (ASCN|STHL|TDCU|BBND|BIQQ|FIQQ|PCRN|SIQQ|TKCA)\ 1ZZ  |
      (AI-2640|GIR\ 0AA)
     )
     \z',
    'x'
)
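
The combined pattern can be exercised the same way before it goes into a query. The following is a rough Python equivalent, not the Oracle call itself: re.VERBOSE plays the role of the x modifier, [A-Za-z] and [A-Za-z0-9] replace the POSIX classes, \Z replaces \z, and is_valid is a hypothetical helper name.

```python
import re

# Verbose mode ignores unescaped whitespace in the pattern, so the
# literal spaces inside postcodes are written as "\ ".
POSTCODE = re.compile(
    r"""\A
    (
      [A-Za-z]{1,2}\d[A-Za-z0-9]?\ \d[A-Za-z]{2}            |
      (ASCN|STHL|TDCU|BBND|BIQQ|FIQQ|PCRN|SIQQ|TKCA)\ 1ZZ   |
      (AI-2640|GIR\ 0AA)
    )
    \Z""",
    re.VERBOSE,
)


def is_valid(postcode: str) -> bool:
    """Return True if the whole string matches one of the three forms."""
    return POSTCODE.match(postcode) is not None


for pc in ["EC1V 9HQ", "ASCN 1ZZ", "GIR 0AA", "AI-2640", "EC1V9HQ", "BFPO 1"]:
    print(f"{pc:10} {is_valid(pc)}")
```

The first four values are accepted, one through each branch. EC1V9HQ fails because the space is required, and BFPO 1 fails because no branch covers it.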

Or you can make the basic regex less strict and accept 2-4 alphanumerics for the first part. Then there's only the special case for Anguilla to worry about.

\A([[:alnum:]]{2,4} \d[[:alpha:]]{2}|AI-2640)\z

On the downside, this will let in postcodes that don't exist. On the upside, you don't have to keep tweaking it for additional special cases. That's probably fine for this level of filtering.
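
To make the trade-off concrete, here is a small Python sketch of the looser pattern, with the same caveats as before: [A-Za-z0-9] and [A-Za-z] stand in for the POSIX classes, \Z for \z, and the candidate list is invented for the example.

```python
import re

# Python translation of the looser pattern (an assumption for illustration).
LOOSE = re.compile(r"\A([A-Za-z0-9]{2,4} \d[A-Za-z]{2}|AI-2640)\Z")

candidates = [
    "EC1V 9HQ",  # ordinary postcode
    "GIR 0AA",   # Girobank, no longer needs its own alternative
    "ASCN 1ZZ",  # overseas territory, likewise
    "AI-2640",   # Anguilla, still a special case
    "1111 1AA",  # not a real postcode, but the loose pattern accepts it
    "EC1V9HQ",   # missing space, still rejected
]

accepted = [pc for pc in candidates if LOOSE.match(pc)]
print(accepted)
```

Everything except EC1V9HQ is accepted, including the all-digit 1111 1AA, which is the kind of nonexistent postcode the looser pattern lets through.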
