Regex: Identify phone numbers in different formats

I have a website where people post jokes. Users can send jokes they like to their own (or their friends') phones as SMS. The name of the sender (the person who added the joke to the site) is displayed below the joke:

Joke #12234

this is the body
of the joke

sender: John

Sometimes people use their phone number as the sender name, which is not allowed in public. I want to detect whether the sender name contains a phone number so that I can censor it. I treat any number with more than 6 digits as a phone number. The problem is that users might separate the digits, like this:

1234567890 should become 123XXX7890
123 456 7890 should become 123 XXX 7890
123-456-7890
123456-7890


and so on. Anything similar to the formats above should be censored. I tried removing the non-numeric characters first and then applying a regular expression, but then it also matches names such as:

john23 peterson12345

Can anyone offer a better way?

To keep all the formatting, replace

(\d{3}[-\s()]*)\d{3}([-\s()]*\d{4})

with

$1XXX$2
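As a rough illustration, the replacement above could be applied in Python as follows. The function name `censor` is made up for this sketch, and Python spells the `$1XXX$2` replacement as `\g<1>XXX\g<2>`; other languages keep the `$1` form.

```python
import re

# Pattern from the answer: 3 digits, optional separators,
# 3 digits (the part that gets masked), optional separators, 4 digits.
PHONE = re.compile(r"(\d{3}[-\s()]*)\d{3}([-\s()]*\d{4})")


def censor(sender: str) -> str:
    """Mask the middle three digits, keeping the user's separators."""
    # "\g<1>XXX\g<2>" is Python's spelling of "$1XXX$2".
    return PHONE.sub(r"\g<1>XXX\g<2>", sender)


print(censor("1234567890"))            # 123XXX7890
print(censor("123 456 7890"))          # 123 XXX 7890
print(censor("123-456-7890"))          # 123-XXX-7890
print(censor("123456-7890"))           # 123XXX-7890
print(censor("john23 peterson12345"))  # john23 peterson12345 (no match)
```

Because the pattern is applied to the original string rather than to a digits-only copy, the letters in `john23 peterson12345` break up the digit runs and the name is left alone.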

To further restrict the match to 10-digit numbers (i.e. disallow additional digits immediately before and after), use negative lookaround assertions:

(?<!\d)(\d{3}[-\s()]*)\d{3}([-\s()]*\d{4})(?!\d)
^^^^^^^                                   ^^^^^^
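The effect of the lookarounds shows up on digit runs longer than ten. The following sketch compares the two patterns in Python; the names `UNBOUNDED`, `BOUNDED`, and `REPL` and the sample strings are invented for the example.

```python
import re

# Same pattern without and with the negative lookarounds.
UNBOUNDED = re.compile(r"(\d{3}[-\s()]*)\d{3}([-\s()]*\d{4})")
BOUNDED = re.compile(r"(?<!\d)(\d{3}[-\s()]*)\d{3}([-\s()]*\d{4})(?!\d)")
REPL = r"\g<1>XXX\g<2>"  # Python's spelling of "$1XXX$2"

long_run = "id 123456789012"  # 12 digits, not a 10-digit phone number

# Without lookarounds the first 10 digits of the run are treated as a phone.
print(UNBOUNDED.sub(REPL, long_run))  # id 123XXX789012

# With lookarounds the 12-digit run is left untouched.
print(BOUNDED.sub(REPL, long_run))    # id 123456789012

# A real 10-digit number is still censored.
print(BOUNDED.sub(REPL, "call 123-456-7890 now"))  # call 123-XXX-7890 now
```

Note that this variant deliberately ignores numbers that are not exactly ten digits long, which is stricter than the "more than 6 digits" rule in the question.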

Finally, what if typos lead users to insert spaces or symbols in the middle of a group, e.g. (123)45 6-7890? To catch these too, use the following:

(?<!\d)((?:\d[-\s()]*){3})(?:\d[-\s()]*){3}((?:\d[-\s()]*){4})(?!\d)

This may, however, catch "too much", e.g. 1-2-3-4-5-6-7-8-9-0. You will have to decide what balance you want to strike.
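To see that trade-off concretely, here is a small Python sketch of the tolerant pattern. The names `TOLERANT` and `censor_tolerant` are made up for the example.

```python
import re

# Tolerant pattern from the answer: separators may follow every digit.
TOLERANT = re.compile(
    r"(?<!\d)((?:\d[-\s()]*){3})(?:\d[-\s()]*){3}((?:\d[-\s()]*){4})(?!\d)"
)


def censor_tolerant(sender: str) -> str:
    """Mask digits 4-6 of any 10-digit run, allowing stray separators."""
    return TOLERANT.sub(r"\g<1>XXX\g<2>", sender)


# The typo case is caught. Separators inside the masked group are dropped.
print(censor_tolerant("(123)45 6-7890"))       # (123)XXX7890

# The "too much" case: single digits joined by hyphens also match.
print(censor_tolerant("1-2-3-4-5-6-7-8-9-0"))  # 1-2-3-XXX7-8-9-0

# Letters between digit runs still prevent a match.
print(censor_tolerant("john23 peterson12345"))  # john23 peterson12345
```

One side effect worth knowing: any separators that sat inside or directly after the masked middle group are replaced along with it, so the output formatting is not always preserved exactly.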

Because there are so many phone number formats in the world, you can use this regex to match any string made up only of digits, hyphens, plus signs, whitespace, and parentheses:

^[0-9-+\s()]*$
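A quick Python check of what this last pattern accepts. This sketch assumes the `s` in the character class stands for `\s` (whitespace); the helper name `looks_like_phone` is invented for the example.

```python
import re

# Assumption: the class is meant to contain "\s" (whitespace),
# not the literal letter "s".
PHONE_CHARS = re.compile(r"^[0-9-+\s()]*$")


def looks_like_phone(text: str) -> bool:
    """True if the whole string consists only of phone-number characters."""
    return PHONE_CHARS.match(text) is not None


print(looks_like_phone("+1 (123) 456-7890"))  # True
print(looks_like_phone("John 123-456-7890"))  # False: letters present
print(looks_like_phone(""))                   # True: "*" allows empty input
```

Because the pattern is anchored with `^` and `$`, it only answers whether an entire string is phone-like. It does not locate a number inside a longer sender name, and it does not check the digit count, so it would probably need to be combined with one of the patterns above for the censoring task.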
