
Capturing multiple phone numbers with regex

I'm trying to get better at regex, as I'm tired of constantly having to look up existing solutions instead of creating my own. I'm having a bit of difficulty understanding why this isn't working, though:

I'm trying to extract both phone numbers from the following string (the numbers and address are random):

"+1-541-754-3010 156 Alphand_St. <J Steeve>\n 133, Green, Rd. <E Kustur> NY-56423 ;+1-541-914-3010\n"

So I'm using the following expression:

 /\+(.+)(?:\s|\b)/

These are the matches I'm getting back:

  1. 1-541-754-3010 156 Alphand_St.
  2. 1-541-914-3010

So I'm getting the last one correctly, but not the first one. Based on the expression, it should match anything between a + and a space/boundary. But for some reason it's not stopping at the space after the first number. Am I going about this the wrong way?

Given the format of your search string, and since you are starting with a literal "+", I would just match the run of digits and separators (such as the hyphen) that follows it:

/\+([0-9\-]+)/
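As a quick check, here is a minimal sketch of that pattern applied to the string from the question. It uses Python's re module purely for illustration (the question does not say which regex engine is in use), and the names text, PHONE_RE and numbers are made up for this example. It also assumes the "\n" sequences in the sample are real newlines.

```python
import re

# Sample string from the question; "\n" is assumed to be a real newline.
text = (
    "+1-541-754-3010 156 Alphand_St. <J Steeve>\n"
    " 133, Green, Rd. <E Kustur> NY-56423 ;+1-541-914-3010\n"
)

# A literal "+" followed by one or more digits or hyphens.
# The capture group leaves out the leading "+".
PHONE_RE = re.compile(r"\+([0-9\-]+)")

numbers = PHONE_RE.findall(text)
print(numbers)  # ['1-541-754-3010', '1-541-914-3010']
```

Because the character class allows only digits and hyphens, the match stops at the first space or newline without needing an explicit terminator.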

Your ".+" says to match everything until there's a \s. However, "." also matches whitespace, so the match runs straight through any earlier \s on the way to the last one it can reach.

Remember that dashes (-) are not word characters, so \b will match between, for example, 1 and -, or - and 5, and so on. Also, your current regex is greedy: it will try to match as many characters as it can with the repeated . , which is why it goes all the way to the end of the first line (where the trailing \s or \b can finally be satisfied). Making it lazy (with .+? ) wouldn't fix it, though, because then it would terminate right after the 1 in 1-541 (because there is a word boundary between 1 and -).
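The greedy and lazy behaviour described above can be reproduced with a short sketch. It again uses Python's re module as a stand-in for whatever engine the question uses, and assumes the "\n" sequences in the sample string are real newlines. The variable names are illustrative only.

```python
import re

# Sample string from the question; "\n" is assumed to be a real newline.
text = (
    "+1-541-754-3010 156 Alphand_St. <J Steeve>\n"
    " 133, Green, Rd. <E Kustur> NY-56423 ;+1-541-914-3010\n"
)

# Greedy: ".+" grabs as much of the line as it can, then backs off
# only as far as needed for (?:\s|\b) to match.
greedy = re.findall(r"\+(.+)(?:\s|\b)", text)

# Lazy: ".+?" takes a single character, and the word boundary
# between "1" and "-" immediately satisfies (?:\s|\b).
lazy = re.findall(r"\+(.+?)(?:\s|\b)", text)

print(greedy[0])  # runs well past the first phone number
print(greedy[1])  # 1-541-914-3010
print(lazy)       # ['1', '1']
```

Neither quantifier gives the desired result, which is why restricting the characters that may be matched is the simpler fix.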

Try using a character set of digits and - instead:

\+([\d-]+)

https://regex101.com/r/ktbcHJ/1
