htmlentities() returns empty values despite UTF-8
I'm trying to escape a string in PHP using htmlentities(). The problem is that htmlentities() returns an empty string.
I'm receiving this string through an HTML <form>. The page containing the form tag has the following meta tag: <meta charset="utf-8">
My string is encoded in UTF-8, the third parameter of htmlentities() is 'UTF-8', and I still get an empty string.
Here is my code:
$str = strtolower(trim($str));
var_dump($str, mb_detect_encoding($str), htmlentities($str), htmlentities($str, ENT_COMPAT, 'UTF-8'), htmlentities($str, ENT_COMPAT, 'ISO-8859-1'));
And here is what var_dump displays:
// Original string is é-è
// Expected output is &eacute;-&egrave;
string '�-�' (length=5) // Original string but why is the length 5 ?
string 'UTF-8' (length=5)
string '' (length=0)
string '' (length=0)
string '&atilde;&copy;-&atilde;&uml;' (length=28) // WTF ??
Does anyone know where this is coming from?
OK, I found out what was wrong. strtolower() is causing the problem.
Please use mb_strtolower() instead.
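A minimal sketch of that fix, assuming the mbstring extension is enabled and PHP 7 or later (for the "\u{...}" escapes). The sample input and the variable names $clean and $escaped are made up for illustration; they are not from the original post.

```php
<?php
// Hypothetical form input: " É-È " encoded as UTF-8.
$str = " \u{00C9}-\u{00C8} ";

// mb_strtolower() is multibyte-aware. Passing the encoding explicitly
// avoids relying on the mbstring default encoding.
$clean = mb_strtolower(trim($str), 'UTF-8');

// The lowercased string is still valid UTF-8, so htmlentities() can escape it.
$escaped = htmlentities($clean, ENT_COMPAT, 'UTF-8');

var_dump($clean, mb_check_encoding($clean, 'UTF-8'), $escaped);
```

With this input, $escaped should come out as the '&eacute;-&egrave;' string the question expected.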
// Raw string, before any processing
var_dump($str, mb_detect_encoding($str), htmlentities($str), htmlentities($str, ENT_COMPAT, 'UTF-8'), htmlentities($str, ENT_COMPAT, 'ISO-8859-1'));
// After trim()
$str = trim($str);
var_dump($str, mb_detect_encoding($str), htmlentities($str), htmlentities($str, ENT_COMPAT, 'UTF-8'), htmlentities($str, ENT_COMPAT, 'ISO-8859-1'));
// After strtolower()
$str = strtolower($str);
var_dump($str, mb_detect_encoding($str), htmlentities($str), htmlentities($str, ENT_COMPAT, 'UTF-8'), htmlentities($str, ENT_COMPAT, 'ISO-8859-1'));
Here is the output:
// raw string é-è
string 'é-è' (length=5)
string 'UTF-8' (length=5)
string '&eacute;-&egrave;' (length=17)
string '&eacute;-&egrave;' (length=17)
string '&Atilde;&copy;-&Atilde;&uml;' (length=28)
// trim('é-è')
string 'é-è' (length=5)
string 'UTF-8' (length=5)
string '&eacute;-&egrave;' (length=17)
string '&eacute;-&egrave;' (length=17)
string '&Atilde;&copy;-&Atilde;&uml;' (length=28)
// strtolower('é-è')
string '�-�' (length=5)
string 'UTF-8' (length=5)
string '' (length=0)
string '' (length=0)
string '&atilde;&copy;-&atilde;&uml;' (length=28)
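Somehow, strtolower() seems to work only in ISO-8859-1: as you can see in the var_dumps, it transforms Ã into ã. In UTF-8, é and è each take two bytes (which is why the length is 5), and the first byte of each is 0xC3, which is Ã in ISO-8859-1. Lowercasing that byte to 0xE3 (ã) leaves a sequence that is no longer valid UTF-8, and htmlentities() returns an empty string for input that is invalid in the given encoding unless ENT_IGNORE or ENT_SUBSTITUTE is set.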
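A small sketch of what happens at the byte level, assuming PHP 7 or later with mbstring enabled. The Ã to ã change is reproduced by hand with str_replace() so that the result does not depend on the PHP version or locale; the variable names $utf8 and $broken are illustrative only.

```php
<?php
$utf8 = "\u{00E9}-\u{00E8}";   // "é-è" in UTF-8

// strlen() counts bytes, mb_strlen() counts characters.
var_dump(strlen($utf8));             // int(5)
var_dump(mb_strlen($utf8, 'UTF-8')); // int(3)

// Reproduce by hand what a Latin-1 lowercase does: byte 0xC3 (Ã) becomes 0xE3 (ã).
$broken = str_replace("\xC3", "\xE3", $utf8);

var_dump(bin2hex($utf8));    // string(10) "c3a92dc3a8"
var_dump(bin2hex($broken));  // string(10) "e3a92de3a8"

// 0xE3 announces a three-byte sequence that never completes, so the
// string is no longer valid UTF-8 and htmlentities() gives up.
var_dump(mb_check_encoding($broken, 'UTF-8'));        // bool(false)
var_dump(htmlentities($broken, ENT_COMPAT, 'UTF-8')); // string(0) ""
```

Checking the input with mb_check_encoding() before escaping is one way to spot this kind of corruption early rather than getting a silent empty string.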