
Standardize/Sanitize variably-formatted phone numbers to be purely 10-digit strings

Before I store user-supplied phone numbers in my database, I need to standardize/sanitize the string to consist of exactly 10 digits.

I want to end up with 1112223333 from all of these potential input values:

(111)222-3333
111-222-3333
111.222.3333
+11112223333
11112223333

In the last two strings, there's a 1 as the country code.

I was able to make some progress with:

preg_replace('/\D/', '', mysqli_real_escape_string($conn, $_POST["phone"]));

Can anyone help me to fix up the strings that have more than 10 digits?

Your preg_replace already handles all but the last two inputs. Next, check the length of the resulting string and remove the first digit if there are more than 10 digits.

$str = preg_replace('/\D/', '', mysqli_real_escape_string($conn, $_POST["phone"]));

// An 11-digit result still carries the leading country code, so drop it.
if (strlen($str) > 10) {
    $str = substr($str, 1);
}

If you want to parse phone numbers, a very useful library is giggsey/libphonenumber-for-php. It is based on Google's libphonenumber and also has an online demo that shows how it works.
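As a rough sketch of how that library might be used here: the helper name toNationalNumber() and the default region 'US' are assumptions for illustration, and the code presumes the package was installed through Composer and that vendor/autoload.php has been loaded before the function is called.

```php
<?php
// Assumption: installed with `composer require giggsey/libphonenumber-for-php`
// and Composer's autoloader is loaded before toNationalNumber() is called.

use libphonenumber\NumberParseException;
use libphonenumber\PhoneNumberUtil;

/**
 * Hypothetical helper: returns the national number as a digit string,
 * or null if the input cannot be parsed or is not valid for the region.
 */
function toNationalNumber(string $raw, string $region = 'US'): ?string
{
    $util = PhoneNumberUtil::getInstance();

    try {
        // The region is only used when the input has no leading "+".
        $number = $util->parse($raw, $region);
    } catch (NumberParseException $e) {
        return null; // not recognisable as a phone number
    }

    if (!$util->isValidNumber($number)) {
        return null;
    }

    return (string) $number->getNationalNumber();
}
```

Note that isValidNumber() checks real numbering rules, so placeholder values such as 111-222-3333 may be rejected.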

Do it in two passes:

$phone = [
'(111)222-3333',
'111-222-3333',
'111.222.3333',
'+11112223333',
'11112223333',
'+331234567890',
];

# remove non-digits
$res = preg_replace('/\D+/', '', $phone);
# keep only the last 10 digits
$res = preg_replace('/^\d+(\d{10})$/', '$1', $res);
print_r($res);

Output:

Array
(
    [0] => 1112223333
    [1] => 1112223333
    [2] => 1112223333
    [3] => 1112223333
    [4] => 1112223333
    [5] => 1234567890
)
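One caveat with keeping the last 10 digits: the sixth sample, +331234567890, is not a North American number, yet it is silently reduced to 1234567890. Below is a minimal sketch of a stricter variant, assuming that only a leading country code of 1 should be dropped and that anything else should be rejected. The function name normalizeUsPhone() is hypothetical.

```php
<?php
/**
 * Hypothetical helper: returns a 10-digit string, or null when the input
 * cannot be reduced to exactly 10 digits (optionally preceded by a 1).
 */
function normalizeUsPhone(string $raw): ?string
{
    // Strip everything that is not a digit.
    $digits = preg_replace('/\D+/', '', $raw);

    // Accept 10 digits, optionally preceded by the country code 1.
    if (preg_match('/^1?(\d{10})$/', $digits, $m) === 1) {
        return $m[1];
    }

    return null;
}

var_dump(normalizeUsPhone('+11112223333'));  // string(10) "1112223333"
var_dump(normalizeUsPhone('+331234567890')); // NULL
```

Returning null instead of a truncated value lets the caller decide whether to show a validation error or fall back to another rule.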

This task can/should be accomplished by making just one pass over the string to replace unwanted characters.

.*       #greedily match zero or more of any character
(\d{3})  #capture group 1
\D*      #greedily match zero or more non-digits
(\d{3})  #capture group 2
\D*      #greedily match zero or more non-digits
(\d{4})  #capture group 3
$        #match end of string

Matching the position of the end of the string ensures that the final 10 digits from the string are captured and any extra digits at the front of the string are ignored.

Code:

$strings = [
    '(111)222-3333',
    '111-222-3333',
    '111.222.3333',
    '+11112223333',
    '11112223333'
];

foreach ($strings as $string) {
    echo preg_replace(
             '/.*(\d{3})\D*(\d{3})\D*(\d{4})$/',
             '$1$2$3',
             $string
         ) . "\n---\n";
}

Output:

1112223333
---
1112223333
---
1112223333
---
1112223333
---
1112223333
---

The same result can be achieved by changing the third capture group to be a lookahead and only using two backreferences in the replacement string.

echo preg_replace(
         '/.*(\d{3})\D*(\d{3})\D*(?=\d{4}$)/',
         '$1$2',
         $string
     );

Finally, a much simpler pattern can be used to purge all non-digits, but this alone will not trim the string down to 10 characters. Calling substr() with a starting offset of -10 will ensure that the last 10 digits are preserved.

echo substr(preg_replace('/\D+/', '', $string), -10);

As a side note, you should use a prepared statement to interact with your database instead of relying on escaping, which may have vulnerabilities.
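A minimal sketch of what that could look like with mysqli follows. The helper name savePhone(), the table users, and the column phone are all assumptions for illustration; adapt them to your schema.

```php
<?php
/**
 * Hypothetical helper: stores a sanitized phone number with a mysqli
 * prepared statement. The table `users` and column `phone` are assumptions.
 */
function savePhone(mysqli $conn, string $rawPhone): bool
{
    // Keep only the last 10 digits of the input.
    $phone = substr(preg_replace('/\D+/', '', $rawPhone), -10);

    $stmt = $conn->prepare('INSERT INTO users (phone) VALUES (?)');
    if ($stmt === false) {
        return false;
    }

    // "s" binds the value as a string; no manual escaping is needed.
    $stmt->bind_param('s', $phone);
    $ok = $stmt->execute();
    $stmt->close();

    return $ok;
}

// Usage, assuming $conn is an open mysqli connection:
// savePhone($conn, $_POST['phone']);
```

Because the value is sent separately from the SQL text, the call to mysqli_real_escape_string() is no longer needed.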

Use str_replace with an array of the characters you want to remove.

$str = "(111)222-3333 111-222-3333 111.222.3333 +11112223333";

echo str_replace(["(", ")", "-", "+", "."], "", $str);

https://3v4l.org/80AWc
