Regex giving extra output in Java

My code looks like this:

String try1 = " how abcd is a lake 3909 Witmer Road Niagara Falls NY 14305 and our adress is 120, 5th cross, 1st main, domlur, Bangalore 50071 nad 420, Fanboy Lane, NewYark, AS 12345";
String add1="( \\b+[0-9]{3,5}[, ]* (.*)[, ]* (.*)[, ]* [a-zA-Z]{2} [0-9]{5})";
Pattern p = Pattern.compile(add1);
Matcher m = p.matcher(try1);
if(m.find())
{ 
    System.out.println("Address ======> " + m.group());
}
else System.out.println("Address ======>Not found ");

I want only the US addresses in the output:

[(3909 Witmer Road Niagara Falls NY 14305) and (420, Fanboy Lane, NewYark, AS 12345)]

But the output is this:

(3909 Witmer Road Niagara Falls NY 14305 and our adress is 120, 5th cross, 1st main, domlur, Bangalore 50071 nad 420, Fanboy Lane, NewYark, AS 12345)

You could try a regex more like this:

"(\\b[0-9]{3,5},? [A-Za-z]+(?: [A-Za-z]+,?)* [a-zA-Z]{2} [0-9]{5})"

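The [A-Za-z]+(?: [A-Za-z]+,?)* part allows only letters (no digits) between the house number and the state, so a match cannot run through another number. Note that m.find() returns one match per call, so call it in a loop to get every address.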
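As a rough sketch of how that pattern could be used on the input from the question, the class below collects every match with a while loop. The class name AddressFinder and the method findAddresses are made up for this illustration; only the regex and the input string come from the post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named for this example only.
class AddressFinder {
    // Regex from the answer: house number, letter-only words,
    // two-letter state, five-digit ZIP.
    static final Pattern ADDRESS = Pattern.compile(
        "(\\b[0-9]{3,5},? [A-Za-z]+(?: [A-Za-z]+,?)* [a-zA-Z]{2} [0-9]{5})");

    // Returns every non-overlapping match found in the text, in order.
    static List<String> findAddresses(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = ADDRESS.matcher(text);
        while (m.find()) {          // find() advances to the next match on each call
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String try1 = " how abcd is a lake 3909 Witmer Road Niagara Falls NY 14305 and our adress is 120, 5th cross, 1st main, domlur, Bangalore 50071 nad 420, Fanboy Lane, NewYark, AS 12345";
        for (String address : findAddresses(try1)) {
            System.out.println("Address ======> " + address);
        }
    }
}
```

With the question's input this should print the two US addresses and skip the Bangalore one, because "5th" and "1st" start with digits and so cannot match [A-Za-z]+.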

regex101 demo.

The * operator is greedy, so it matches as many characters as it can. In your expression, the [a-zA-Z]{2} [0-9]{5} part, which is meant to match the state and ZIP code, ends up matching the very last state and ZIP in the input, because the .* patterns earlier in the expression expand to cover as many characters as they can.
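The greedy behaviour is easy to see on a smaller string. The snippet below is only an illustration: the input text, the simplified pattern, and the names GreedyDemo and firstMatch are invented for this example and are not from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, named for this example only.
class GreedyDemo {
    // Returns the first match of regex in text, or null if there is none.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Made-up input containing two addresses.
        String text = "12 Elm St NY 10001 then 34 Oak Ave CA 90001";
        // .* first runs to the end of the input, then backs off only as far
        // as needed, so the state/ZIP part binds to the LAST "CA 90001".
        System.out.println(firstMatch("[0-9]+ .* [A-Z]{2} [0-9]{5}", text));
    }
}
```

Here the single match covers the whole string, from "12" through "90001", which is the same effect seen in the question's output.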

Try changing the .s to [^0-9] so that they match anything except digits.
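A minimal sketch of that change applied to the question's pattern might look like the following. Two simplifications here are assumptions and not part of the suggestion itself: the leading space and the + after \b are dropped, and the outer capturing group is removed. The class and method names are also made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named for this example only.
class DigitFreeAddressFinder {
    // The question's pattern with each .* replaced by [^0-9]*.
    // Assumption: leading space, "\\b+" and the outer group are simplified away.
    static final Pattern ADDRESS = Pattern.compile(
        "\\b[0-9]{3,5}[, ]* ([^0-9]*)[, ]* ([^0-9]*)[, ]* [a-zA-Z]{2} [0-9]{5}");

    // Returns every non-overlapping match found in the text, in order.
    static List<String> findAddresses(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = ADDRESS.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String try1 = " how abcd is a lake 3909 Witmer Road Niagara Falls NY 14305 and our adress is 120, 5th cross, 1st main, domlur, Bangalore 50071 nad 420, Fanboy Lane, NewYark, AS 12345";
        for (String address : findAddresses(try1)) {
            System.out.println("Address ======> " + address);
        }
    }
}
```

Because [^0-9]* cannot cross a digit, the state and ZIP have to be found at the first run of digits after the house number, so each match stops at its own ZIP code.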
