
json_decode & file_get_contents doesn't get the UTF8 characters

I use

$link = json_decode(file_get_contents("http://graph.facebook.com/111866602162732"));

The response from that URL contains:

 "name": "L\u00e9ry, Quebec",

I then want to convert that so the accented characters display properly, like this:

$location_name = $link->name;
echo 'NAME ORIGINAL: '.$location_name;
$location_name = preg_replace('/\\\\u([0-9a-fA-F]{4})/', '&#x\1;', $location_name); // convert to UTF8
echo '  NAME after: '.$location_name;

I get the following result:

  NAME ORIGINAL: Léry, Quebec     NAME after: Léry, Quebec

My preg_replace is correct, so the original name must be getting transformed by file_get_contents.

If file_get_contents doesn't give you back well-formed UTF-8 text, json_decode will return NULL. JSON must be UTF-8 encoded.

From the json_decode documentation: "This function only works with UTF-8 encoded strings."

So I guess you're reading the data in another encoding. Check that first.
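
One way to check this is to look at json_last_error() after decoding. The sketch below uses two made-up payloads (not the real Facebook response), one with the é encoded as UTF-8 and one as ISO-8859-1, to show how json_decode behaves in each case:

```php
<?php
// Hypothetical sample payloads, used only for illustration.
$utf8Json   = "{\"name\": \"L\xC3\xA9ry, Quebec\"}"; // é as UTF-8 bytes C3 A9
$latin1Json = "{\"name\": \"L\xE9ry, Quebec\"}";     // é as ISO-8859-1 byte E9

// Valid UTF-8 input decodes to an object.
$ok = json_decode($utf8Json);
$okError = json_last_error();

// Input that is not valid UTF-8 makes json_decode return NULL.
$bad = json_decode($latin1Json);
$badError = json_last_error();

var_dump($okError === JSON_ERROR_NONE);
var_dump($bad);
var_dump($badError === JSON_ERROR_UTF8);
```

If json_decode gives you an object with a name property, as it does in the question, the input was already valid UTF-8 and the problem is somewhere after decoding.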

Most likely, you're treating the valid UTF-8 output that json_decode gives you as ISO-8859-1. See, for example: http://www.i18nqa.com/debug/bug-utf-8-latin1.html

Make sure that you're treating your debug output as UTF-8 - that should solve the problem.
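
As a rough sketch of what that means in practice: the code below decodes a sample payload that mirrors the "name" field from the question (the $json string is a stand-in, not a live request), dumps the bytes to confirm they are UTF-8, and declares the charset before printing. It assumes the output goes to a browser as HTML.

```php
<?php
// Stand-in payload mirroring the "name" field quoted in the question.
$json = '{"name": "L\u00e9ry, Quebec"}';
$link = json_decode($json);
$name = $link->name;

// json_decode has already turned \u00e9 into the UTF-8 bytes C3 A9.
$hex = bin2hex($name);

// No literal \uXXXX sequence is left, so the question's preg_replace
// has nothing to match and $count stays at 0.
$after = preg_replace('/\\\\u([0-9a-fA-F]{4})/', '&#x\1;', $name, -1, $count);

// Tell the browser that the bytes being sent are UTF-8.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}
echo htmlspecialchars($name, ENT_QUOTES, 'UTF-8'), "\n";
```

If the page is served without a charset (or with ISO-8859-1), the same two bytes are shown as two separate characters, which is the garbled output described in the linked article.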
