简体   繁体   中英

How to Encode URL Contains Unicode Characters with PHP

Currently, I'm trying to look for a solution to encode url which contains unicode characters, Khmer Unicode. I've tried using php built-in function urlencode() and it gives result: For example: http://www.example.com/?kwd=Mac+Book+Pro +នៅប្រទេសយើង

While I've tested with Google search, it results: https://www.google.com.kh/#hl=en&sclient=psy-ab&q=Mac+Book+Pro+%E1%9E%93%E1%9F%85%E1%9E%94%E1%9F%92%E1%9E%9A%E1%9E%91%E1%9F%81%E1%9E%9F%E1%9E%99%E1%9E%BE%E1%9E%84&oq=Mac+Book+Pro+%E1%9E%93%E1%9F%85%E1%9E%94%E1%9F%92%E1%9E%9A%E1%9E%91%E1%9F%81%E1%9E%9F%E1%9E%99%E1%9E%BE%E1%9E%84

How to do that? Hope someone here would help me. Thanks in advance!

For UTF-8 you can use:

urlencode(utf8_encode($string)); //for encoding

utf8_decode(urldecode($string)); //for decoding

For UTF-16 you can use this function (from notes for urlencode in http://php.net/urlencode ):

function utf16_urlencode ( $str ) {
     # convert characters > 255 into HTML entities
     $convmap = array( 0xFF, 0x2FFFF, 0, 0xFFFF );
     $str = mb_encode_numericentity( $str, $convmap, "UTF-8");

     # escape HTML entities, so they are not urlencoded
     $str = preg_replace( '/&#([0-9a-fA-F]{2,5});/i', 'mark\\1mark', $str );
     $str = urlencode($str);

     # now convert escaped entities into unicode url syntax
     $str = preg_replace( '/mark([0-9a-fA-F]{2,5})mark/i', '%u\\1', $str );
     return $str;
 }
function cleanUrl($url) {
    $res= urlencode(utf8_encode($url));
    $res = str_replace("%3A",":",$res);
    $res = str_replace("%2F","/",$res);
    return $res;
}

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM