
Address parsing and validation technique required for Australian addresses with regex

We need to implement validation for the following field:

Street Address / Business Address

The conditions that need to be handled are as follows:

Post office boxes and private bags are NOT acceptable as an address. The address field should not start with any of the following:

NOTE: ^ = space

G^P^O    
GPO^    
G.P.O    
GPO.    
G.P.O. Box

P^O    
P.O    
PO^    
PO.    
P.O.    
P.O.B    
POBOX    
POST BOX    
POST OFFICE BOX    
P / O Box    
P/O Box    
P O Box    
P.O. Box

BOX^    
BOX.    
Private Bag^    
Private Bag.    
Locked Bag

This list needs to be configurable so that additional rules can be included at a later date. The check does not need to be case-sensitive.

Can you suggest whether this kind of validation is better implemented in JavaScript or in Java?

  1. If I use Java, is it advisable to use the regex classes?

  2. If JavaScript, what approach should I take?

Kindly share sample code if you have any.

I am also trying to find out whether regex would be helpful for this.

PS: This project can't use Google or Yahoo APIs, or any other paid API, to parse the street address.

I wouldn't use regexp in that case. Instead I'd put all the "stop words" in a file, read it into a set, and use a loop to verify:

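import java.util.Locale;
import java.util.Set;

public static boolean isValid(String address) {
  Set<String> stopWords = getSet();  // some magic to get the loaded set
  // Normalize the address once, outside the loop; the check ignores case
  String normalized = address.trim().toLowerCase(Locale.ROOT);
  for (String stopWord : stopWords) {
    if (normalized.startsWith(stopWord.toLowerCase(Locale.ROOT))) {
      return false;  // address starts with a forbidden prefix
    }
  }
  return true;
}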

Big advantage: the stop words are maintained in a file and not compiled into some regular expression that no one will understand a week later.
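For what it's worth, here is a rough sketch of how the "some magic" loading step might look, assuming Java 11+ and a plain text file with one prefix per line. The class name `StopWordAddressValidator`, the `fromFile` helper, the `#` comment convention, and the reuse of the question's `^` notation for a significant space are all assumptions made for illustration, not something the original answer specifies.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: holds the configurable list of forbidden prefixes.
final class StopWordAddressValidator {
    private final Set<String> prefixes = new LinkedHashSet<>();

    // Each line is one forbidden prefix; '^' stands for a space, as in the question.
    StopWordAddressValidator(List<String> lines) {
        for (String line : lines) {
            String entry = line.strip();
            if (entry.isEmpty() || entry.startsWith("#")) {
                continue; // blank line or comment (file convention assumed here)
            }
            prefixes.add(entry.replace('^', ' ').toLowerCase(Locale.ROOT));
        }
    }

    // Assumed file layout: UTF-8 text, one prefix per line.
    static StopWordAddressValidator fromFile(Path file) throws IOException {
        return new StopWordAddressValidator(Files.readAllLines(file, StandardCharsets.UTF_8));
    }

    boolean isValid(String address) {
        // Only leading whitespace is removed, so entries such as "PO^" can still match.
        String normalized = address.stripLeading().toLowerCase(Locale.ROOT);
        for (String prefix : prefixes) {
            if (normalized.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("stopwords", ".txt");
        Files.write(file, List.of("# forbidden prefixes", "PO^", "P.O.", "GPO^",
                "BOX^", "Private Bag^", "Locked Bag"));
        StopWordAddressValidator validator = fromFile(file);
        System.out.println(validator.isValid("PO Box 123, Sydney")); // false
        System.out.println(validator.isValid("12 Portland Street")); // true
        Files.delete(file);
    }
}
```

Writing the trailing space as `^` in the file keeps it visible to whoever maintains the list, and it is what stops "PO^" from rejecting an address such as "Portland Street".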

I'm not a regex guru myself, so my answer will be the long way of doing it.

You can include the list of possible values in a regex:

/(G\sP\sO)|(GPO\s)|(G\.P\.O)|(G\.P\.O\.\sBox)| .... and so on

I don't think it's the most efficient way of solving the problem, but it will still work.
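If you do go the regex route in Java, one possible way to keep the list configurable is to build the alternation from the list at runtime instead of writing it by hand. The sketch below is only an illustration: the class name `RegexAddressValidator` is made up, and it assumes the prefixes arrive as plain strings using the question's `^` notation for a space. It uses `Pattern.quote` so that dots are matched literally, anchors the match at the start of the input (the requirement is "should not start with"), and sets `Pattern.CASE_INSENSITIVE` because case does not matter.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper: compiles a configurable prefix list into one anchored pattern.
final class RegexAddressValidator {
    private final Pattern forbiddenStart;

    RegexAddressValidator(List<String> prefixes) {
        if (prefixes.isEmpty()) {
            // An empty alternation would match every address, so reject it early.
            throw new IllegalArgumentException("at least one prefix is required");
        }
        String alternation = prefixes.stream()
                // '^' stands for a space (question's notation); quote() escapes '.', '/' etc.
                .map(p -> Pattern.quote(p.replace('^', ' ')))
                .collect(Collectors.joining("|"));
        // Anchor at the start of the input and allow leading whitespace.
        forbiddenStart = Pattern.compile("^\\s*(?:" + alternation + ")",
                Pattern.CASE_INSENSITIVE);
    }

    boolean isValid(String address) {
        return !forbiddenStart.matcher(address).find();
    }

    public static void main(String[] args) {
        RegexAddressValidator validator = new RegexAddressValidator(
                List.of("G^P^O", "GPO^", "G.P.O", "P.O", "PO^", "POST OFFICE BOX", "Locked Bag"));
        System.out.println(validator.isValid("g.p.o. Box 99"));       // false
        System.out.println(validator.isValid("45 Post Office Lane")); // true
    }
}
```

The trade-off compared with a plain `startsWith` loop is mostly readability: the generated pattern is never edited by hand, so the list stays the single thing to maintain.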

