I want to convert HTML entities to UTF-8, but mb_convert_encoding()
destroys characters that are already UTF-8 encoded. What's the correct way?
$text = "äöü &auml; &ouml; &uuml; &szlig;";
var_dump(mb_convert_encoding($text, 'UTF-8', 'HTML-ENTITIES'));
// string(24) "Ã¤Ã¶Ã¼ ä ö ü ß"
mb_convert_encoding() isn't the right function for this. Use html_entity_decode() instead: it converts only the actual HTML entities to UTF-8 and leaves the UTF-8 characters already in the string untouched.
$text = "äöü &auml; &ouml; &uuml; &szlig;";
var_dump(html_entity_decode($text, ENT_COMPAT | ENT_HTML401, 'UTF-8'));
which gives
string(18) "äöü ä ö ü ß"
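
The byte counts are the quickest way to confirm this. The following is a minimal sketch, not part of the original answer: it assumes PHP 7.0 or later (for the "\u{...}" escapes) and the mbstring extension, and it writes the umlauts as escapes so the result does not depend on how the script file itself is saved. As a side note, the 'HTML-ENTITIES' encoding of mbstring is deprecated as of PHP 8.2, which is another reason to prefer html_entity_decode().

```php
<?php
// "äöü" as real UTF-8 characters, followed by four HTML entities.
$text = "\u{E4}\u{F6}\u{FC} &auml; &ouml; &uuml; &szlig;";

$decoded = html_entity_decode($text, ENT_COMPAT | ENT_HTML401, 'UTF-8');

// 7 two-byte characters + 4 spaces = 18 bytes, 11 characters.
var_dump(strlen($decoded));                     // int(18)
var_dump(mb_strlen($decoded, 'UTF-8'));         // int(11)

// The result is still well-formed UTF-8; nothing was double-encoded.
var_dump(mb_check_encoding($decoded, 'UTF-8')); // bool(true)
```

If strlen() reports 24 instead of 18, the three leading characters have been encoded twice, which is the symptom shown in the question.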
On my localhost I get string(18) "äöü ä ö ü ß".
I think it's related to the encoding of your page. Open the file in Notepad++, go to Encoding in the menu bar and choose 'Encode in ANSI'. If that doesn't work, try 'Encode in UTF-8 without BOM'.
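
Before changing the file encoding, it can help to look at the bytes PHP actually sees. This is a rough sketch under stated assumptions: describeBytes() is a hypothetical helper, not a PHP built-in, and it needs PHP 7.0 or later and the mbstring extension. The two sample strings stand in for the letter ä as it would be stored in a UTF-8 file and in an ANSI (Windows-1252) file.

```php
<?php
// Hypothetical helper: report length, UTF-8 validity and raw bytes of a string.
function describeBytes(string $s): string
{
    $valid = mb_check_encoding($s, 'UTF-8') ? 'valid UTF-8' : 'not valid UTF-8';
    return sprintf('%d byte(s), %s, hex %s', strlen($s), $valid, bin2hex($s));
}

echo describeBytes("\xC3\xA4"), "\n"; // ä from a file saved as UTF-8
echo describeBytes("\xE4"), "\n";     // ä from a file saved as ANSI / Windows-1252
```

Passing your own string literal to describeBytes() shows which of the two cases applies to your file.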
If that still isn't working, try this:
html_entity_decode($html, ENT_QUOTES, 'cp1252');
This is what was needed on a Windows IIS system for things to start working correctly (see source).
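
Note that with 'cp1252' as the third argument the decoded string is itself in Windows-1252, not UTF-8. If UTF-8 is still the goal, one possible follow-up is to convert the whole string afterwards. The sketch below assumes the input really is Windows-1252 and that the mbstring extension is loaded; the sample $html value is made up for illustration.

```php
<?php
// Hypothetical input: "äöü" as single Windows-1252 bytes, plus four entities.
$html = "\xE4\xF6\xFC &auml; &ouml; &uuml; &szlig;";

// Decode the entities into the same single-byte charset the input uses.
$cp1252 = html_entity_decode($html, ENT_QUOTES, 'cp1252');

// Then convert the complete string to UTF-8 in one step.
$utf8 = mb_convert_encoding($cp1252, 'UTF-8', 'Windows-1252');

var_dump(strlen($cp1252)); // int(11): one byte per character
var_dump(strlen($utf8));   // int(18): same text as UTF-8
```

Decoding first and converting second keeps every step in a single, known charset, so no character gets encoded twice.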