Regex giving extra output in Java
My code is like:
String try1 = " how abcd is a lake 3909 Witmer Road Niagara Falls NY 14305 and our adress is 120, 5th cross, 1st main, domlur, Bangalore 50071 nad 420, Fanboy Lane, NewYark, AS 12345";
String add1="( \\b+[0-9]{3,5}[, ]* (.*)[, ]* (.*)[, ]* [a-zA-Z]{2} [0-9]{5})";
Pattern p = Pattern.compile(add1);
Matcher m = p.matcher(try1);
if (m.find()) {
    System.out.println("Address ======> " + m.group());
} else {
    System.out.println("Address ======>Not found ");
}
I want only US addresses in the output:
[(3909 Witmer Road Niagara Falls NY 14305) and (420, Fanboy Lane, NewYark, AS 12345)]
but it's outputting this:
(3909 Witmer Road Niagara Falls NY 14305 and our adress is 120, 5th cross, 1st main, domlur, Bangalore 50071 nad 420, Fanboy Lane, NewYark, AS 12345)
You could try a regex a bit more like this:
"(\\b[0-9]{3,5},? [A-Za-z]+(?: [A-Za-z]+,?)* [a-zA-Z]{2} [0-9]{5})"
The [A-Za-z]+,? part allows only letters (and not numbers), so a match cannot run across the digits of another house number or ZIP code.
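As a rough sketch of how that pattern could be used: the question's code calls m.find() only once, so it can report at most one address. Looping on find() collects every match. The class name AddressFinder and the method findAddresses are made up for this illustration; only the regex comes from the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named only for this illustration.
class AddressFinder {
    // Regex from the answer: house number, letter-only words, 2-letter state, 5-digit ZIP.
    static final Pattern US_ADDRESS = Pattern.compile(
        "(\\b[0-9]{3,5},? [A-Za-z]+(?: [A-Za-z]+,?)* [a-zA-Z]{2} [0-9]{5})");

    // Returns every non-overlapping match, in order of appearance.
    static List<String> findAddresses(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = US_ADDRESS.matcher(text);
        while (m.find()) {          // keep searching after each match
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String try1 = " how abcd is a lake 3909 Witmer Road Niagara Falls NY 14305"
            + " and our adress is 120, 5th cross, 1st main, domlur, Bangalore 50071"
            + " nad 420, Fanboy Lane, NewYark, AS 12345";
        for (String address : findAddresses(try1)) {
            System.out.println("Address ======> " + address);
        }
    }
}
```

With the sample string from the question, this should print the Witmer Road and Fanboy Lane addresses and skip the Bangalore one, because "5th" and "1st" contain digits that the letter-only groups refuse to match.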
The * operator is greedy, so it matches as many characters as it can. In your expression, the [a-zA-Z]{2} [0-9]{5} part that matches the state and ZIP code ends up matching the very last state and ZIP in the input, because the .* patterns earlier in the expression expand to cover as many characters as they can.
Try changing each . to [^0-9] so that it matches anything except digits.
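A minimal sketch of that change, keeping the question's pattern otherwise intact. Two things here are assumptions rather than part of the answer: the + after \b is dropped (a quantifier on a zero-width boundary adds nothing), and each match is trimmed because the pattern itself starts with a space. The class name DigitFreeAddressFinder is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
class DigitFreeAddressFinder {
    // The question's pattern with each ".*" replaced by "[^0-9]*",
    // so the middle groups can no longer swallow a house number or ZIP code.
    static final Pattern US_ADDRESS = Pattern.compile(
        "( \\b[0-9]{3,5}[, ]* ([^0-9]*)[, ]* ([^0-9]*)[, ]* [a-zA-Z]{2} [0-9]{5})");

    static List<String> findAddresses(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = US_ADDRESS.matcher(text);
        while (m.find()) {
            // The pattern begins with a literal space, so trim it off the result.
            found.add(m.group().trim());
        }
        return found;
    }

    public static void main(String[] args) {
        String try1 = " how abcd is a lake 3909 Witmer Road Niagara Falls NY 14305"
            + " and our adress is 120, 5th cross, 1st main, domlur, Bangalore 50071"
            + " nad 420, Fanboy Lane, NewYark, AS 12345";
        for (String address : findAddresses(try1)) {
            System.out.println("Address ======> " + address);
        }
    }
}
```

One trade-off of this approach: a street name that contains a digit, such as "5th Avenue", would no longer match, since [^0-9] rejects it.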