
Convert to utf8 two byte encoded data PHP

I have some data in a database that displays like this:

Ã¸Ã¥Ã±Ã‰Ã©

Judging from this, Ã¸ should be ø. I'm not sure about a few things, but my research so far suggests that these characters are stored as two-byte UTF-8 sequences whose bytes are being displayed individually, so one character (ø) shows up as two (Ã and ¸).
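To make that suspicion concrete, here is a minimal sketch of how this kind of garbling can arise. It assumes the stored UTF-8 bytes were at some point read as Windows-1252 and re-encoded as UTF-8; that is an assumption, and the real cause in the database may differ. The variable names are illustrative only.

```php
<?php
// "ø" written as explicit UTF-8 bytes so the example does not depend on the file encoding.
$correct = "\xC3\xB8";
echo bin2hex($correct), "\n";   // c3b8: one character, two bytes

// Misread each byte as a Windows-1252 character and re-encode the result as UTF-8.
$garbled = iconv("Windows-1252", "UTF-8", $correct);
echo bin2hex($garbled), "\n";   // c383c2b8: "Ã" (c383) followed by "¸" (c2b8)
echo $garbled, "\n";            // shows Ã¸ on a UTF-8 terminal
```

Each original byte has become its own character, which is why one letter shows up as two.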

So how do I convert it? At the moment I have tried the following:

$text = "Ã¸Ã¥Ã±Ã‰Ã©";
echo "Original: " . $text . "<br/>";
// Detect the current encoding, then convert from it to UTF-8
echo "iconv detect: " . iconv(mb_detect_encoding($text, mb_detect_order(), true), "UTF-8", $text) . "<br/>";
// Treat the input as ASCII
echo "ASCII convert: " . iconv('ASCII', 'UTF-8//IGNORE', $text) . "<br/>";
// Treat the input as ISO-8859-1
echo "MB Convert: " . mb_convert_encoding($text, "UTF-8", "iso-8859-1") . "<br/>";

// Wrong way around?

echo "ASCII convert: " . iconv('UTF-8', 'ASCII//IGNORE', $text) . "<br/>";
echo "MB Convert: " . mb_convert_encoding($text, "iso-8859-1", "UTF-8") . "<br/>";

Original: Ã¸Ã¥Ã±Ã‰Ã©

iconv detect: Ã¸Ã¥Ã±Ã‰Ã©

ASCII convert:

MB Convert: øÃ¥ñÃâ°Ã©

ASCII convert:

MB Convert: øåñ ?é

It's worth noting that this only affects special characters. Plain letters such as abcdefghijkl... are all fine; only accented and special characters come out garbled.
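That fits how UTF-8 works: ASCII characters take one byte and have the same byte value in Windows-1252 and ISO-8859-1, so a wrong round trip leaves them untouched, while accented characters take two bytes and get split. A small sketch with illustrative variable names:

```php
<?php
$ascii    = "abcdefghijkl";
$accented = "\xC3\xB8"; // "ø" as UTF-8 bytes

echo strlen($ascii), "\n";    // 12: one byte per character
echo strlen($accented), "\n"; // 2: one character, two bytes

// Re-encoding from Windows-1252 leaves ASCII unchanged but alters the accented character.
var_dump(iconv("Windows-1252", "UTF-8", $ascii) === $ascii);       // bool(true)
var_dump(iconv("Windows-1252", "UTF-8", $accented) === $accented); // bool(false)
```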

Ah, I have it. In case anyone needs it in the future:

$text = "JÃ¸rgen FurÃ¸y HÃ¥kansson SahlÃ©n";

echo "Original: " . $text . "<br/>";
echo "Windows iconv: " . iconv("UTF-8", "Windows-1252", $text) . "<br/>";

Gives:

Original: JÃ¸rgen FurÃ¸y HÃ¥kansson SahlÃ©n
Windows iconv: Jørgen Furøy Håkansson Sahlén
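For what it's worth, a likely reason the earlier ISO-8859-1 attempt printed a ? is the É: its garbled form is Ã‰, and ‰ (U+2030) exists in Windows-1252 (byte 0x89) but not in ISO-8859-1. A minimal sketch, assuming the mbstring extension with its default substitute character:

```php
<?php
// "Ã‰" as explicit UTF-8 bytes: the double-encoded form of "É".
$garbledE = "\xC3\x83\xE2\x80\xB0";

// ISO-8859-1 has no slot for U+2030, so mbstring substitutes "?" (0x3f).
$latin1 = mb_convert_encoding($garbledE, "ISO-8859-1", "UTF-8");
echo bin2hex($latin1), "\n"; // c33f: not valid UTF-8

// Windows-1252 maps U+2030 to 0x89, which restores the original two bytes.
$cp1252 = mb_convert_encoding($garbledE, "Windows-1252", "UTF-8");
echo bin2hex($cp1252), "\n"; // c389: "É" in UTF-8
```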

So the all-important part is Windows-1252:

iconv("UTF-8","Windows-1252",$text)
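If the same column holds a mix of garbled and already-correct values, blindly converting everything would break the good ones. Below is a rough sketch of a guard around the call above. The function name `fix_double_encoded` is hypothetical, and the validity check is only a heuristic that assumes the mbstring extension is available.

```php
<?php
/**
 * Hypothetical helper: undo one round of double encoding.
 * Returns the input unchanged when the conversion fails or
 * the converted bytes are not valid UTF-8.
 */
function fix_double_encoded(string $text): string
{
    // iconv returns false (and raises a notice, suppressed here) when the
    // text contains a character that Windows-1252 cannot represent.
    $candidate = @iconv("UTF-8", "Windows-1252", $text);

    if ($candidate === false || !mb_check_encoding($candidate, "UTF-8")) {
        return $text;
    }
    return $candidate;
}

// Strings are written as explicit bytes so the file encoding does not matter.
$garbled = "J\xC3\x83\xC2\xB8rgen"; // "JÃ¸rgen"
$correct = "J\xC3\xB8rgen";         // "Jørgen"

echo bin2hex(fix_double_encoded($garbled)), "\n"; // 4ac3b87267656e, i.e. "Jørgen"
var_dump(fix_double_encoded($correct) === $correct); // bool(true): left alone
```

An already-correct "Jørgen" converts to the single byte 0xF8 for ø, which is not valid UTF-8, so the helper returns the original string instead.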
