I'm trying to get the UTF-8 bytes (in decimal) of a Unicode string. For instance, the following call should produce the array shown after it:
function unicode_to_utf8_bytes($string) {
}
$text = 'Hello 😀';
$result = unicode_to_utf8_bytes($text);
var_dump($result);
array(10) {
[0]=>
int(72)
[1]=>
int(101)
[2]=>
int(108)
[3]=>
int(108)
[4]=>
int(111)
[5]=>
int(32)
[6]=>
int(240)
[7]=>
int(159)
[8]=>
int(152)
[9]=>
int(128)
}
An example of the result can be seen here:
http://apps.timwhitlock.info/unicode/inspect?s=Hello+%F0%9F%98%80
I feel I'm close. This is what I have so far, followed by the output it produces:
function utf8_char_code_at($str, $index) {
$char = mb_substr($str, $index, 1, 'UTF-8');
if (mb_check_encoding($char, 'UTF-8')) {
$ret = mb_convert_encoding($char, 'UTF-32BE', 'UTF-8');
return hexdec(bin2hex($ret));
}
else
return null;
}
function unicode_to_utf8_bytes($str) {
$result = array();
for ($i=0; $i<mb_strlen($str, '8bit'); $i++)
$result[] = utf8_char_code_at($str, $i);
return $result;
}
$string = 'Hello 😀';
var_dump(unicode_to_utf8_bytes($string));
array(10) {
[0]=>
int(72)
[1]=>
int(101)
[2]=>
int(108)
[3]=>
int(108)
[4]=>
int(111)
[5]=>
int(32)
[6]=>
int(128512)
[7]=>
int(0)
[8]=>
int(0)
[9]=>
int(0)
}
Any help will be much appreciated!
This gets the results you were looking for:
$string = 'Hello 😀';
var_export(ascii_to_dec($string));
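function ascii_to_dec($str)
{
    // Initialise the array so an empty string returns array() instead of null.
    $dec_array = array();
    for ($i = 0, $j = strlen($str); $i < $j; $i++) {
        // strlen() and [] work on raw bytes; ord() gives each byte's decimal value.
        // The $str{$i} syntax was removed in PHP 8, so use $str[$i].
        $dec_array[] = ord($str[$i]);
    }
    return $dec_array;
}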
Results:
array (
0 => 72,
1 => 101,
2 => 108,
3 => 108,
4 => 111,
5 => 32,
6 => 240,
7 => 159,
8 => 152,
9 => 128,
)
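Why the attempt in the question goes wrong, as far as I can tell: it converts each character to UTF-32BE, which gives the code point (128512 is U+1F600) rather than the UTF-8 bytes. It also loops over the byte count (10) while indexing by character (7), so the last three mb_substr() calls return an empty string and hexdec('') gives 0. Since a PHP string is already a byte sequence, no mb_* functions are needed. A shorter sketch of the same idea uses unpack() with the 'C*' format. The function name utf8_bytes is made up for this example, and it assumes the input string is already UTF-8 encoded (for a literal, that means the source file is saved as UTF-8):

```php
<?php
// Hypothetical helper, not part of the original answer.
// Assumes $str already holds UTF-8 encoded bytes.
function utf8_bytes(string $str): array
{
    // 'C*' unpacks every byte as an unsigned char (0-255).
    $bytes = unpack('C*', $str);

    // unpack() returns a 1-indexed array, so array_values() re-indexes
    // it from 0; the false check is a guard for a failed unpack().
    return $bytes === false ? array() : array_values($bytes);
}

var_export(utf8_bytes('Hello 😀'));
```

This should print the same ten values as the Results above. The explicit loop with ord() is easier to follow; unpack() is just more compact.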