PHP UTF-8 mb_convert_encode and Internet Explorer

I have been reading about character encoding for a few days, and I want to serve all my pages as UTF-8 for compatibility. But I got stuck trying to convert user input to UTF-8: it works in all browsers except Internet Explorer (as always).

I don't know what's wrong with my code; it seems fine to me.

  • I set the header with the character encoding
  • I saved the file as UTF-8 (no BOM)

This only happens when you access the page with a $_GET parameter in Internet Explorer, e.g. myscript.php?c=äüöß. When I write special characters directly in my page, they are displayed correctly.

This is my code:

// User Input
$_GET['c'] = "äüöß"; // Access URL ?c=äüöß
//--------
header("Content-Type: text/html; charset=utf-8");
mb_internal_encoding('UTF-8');

$_GET = userToUtf8($_GET);

function userToUtf8($string) {
    if(is_array($string)) {
        $tmp = array();
        foreach($string as $key => $value) {
            $tmp[$key] = userToUtf8($value);
        }
        return $tmp;
    }

    return userDataUtf8($string);
}

function userDataUtf8($string) {
    print("1: " . mb_detect_encoding($string) . "<br>"); // Shows: 1: UTF-8
    $string = mb_convert_encoding($string, 'UTF-8', mb_detect_encoding($string)); // Convert non UTF-8 String to UTF-8
    print("2: " . mb_detect_encoding($string) . "<br>"); // Shows: 2: ASCII
    $string = preg_replace('/[\xF0-\xF7].../s', '', $string);
    print("3: " . mb_detect_encoding($string) . "<br>"); // Shows: 3: ASCII

    return $string;
}
echo $_GET['c']; // Shows nothing
echo mb_detect_encoding($_GET['c']); // ASCII
echo "äöü+#"; // Shows "äöü+#"

The most confusing part is that it reports the string as converted from UTF-8 to ASCII. Can someone tell me why the special characters are not displayed correctly, and what is wrong here? Or is this a bug in Internet Explorer?

Edit: If I disable the conversion, it reports everything as UTF-8, but the characters still are not displayed. They show up as "????".

Note: This happens ONLY in Internet Explorer!

Although I prefer using URL-encoded strings in the address bar, in your case you can try encoding $_GET['c'] to UTF-8, e.g.:

$_GET['c'] = utf8_encode($_GET['c']);
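
A slightly more defensive variant of the same idea, as a rough sketch rather than a definitive fix: it assumes IE sends the raw query string in a single-byte legacy encoding such as ISO-8859-1 (which is what utf8_encode converts from), and it only converts values that are not already valid UTF-8, so input from other browsers is left alone. The name ensureUtf8 and the ISO-8859-1 fallback are my own assumptions, not something from the question. Note also that utf8_encode is deprecated as of PHP 8.2, so the sketch uses mb_convert_encoding with an explicit source encoding instead of mb_detect_encoding.

```php
<?php
// Hypothetical helper (the name and the fallback encoding are assumptions).
// Recursively makes sure every string in $value is valid UTF-8.
function ensureUtf8($value, string $fallback = 'ISO-8859-1')
{
    if (is_array($value)) {
        $result = [];
        foreach ($value as $key => $item) {
            $result[$key] = ensureUtf8($item, $fallback);
        }
        return $result;
    }

    $value = (string) $value;

    // Already valid UTF-8 (this includes plain ASCII): leave untouched.
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }

    // Otherwise treat the bytes as the assumed legacy encoding and convert.
    return mb_convert_encoding($value, 'UTF-8', $fallback);
}
```

It would be called once, before any output, as $_GET = ensureUtf8($_GET);. If the legacy bytes actually come in a different code page, the fallback argument has to be adjusted accordingly.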

An approach to displaying the characters that worked in IE 11.0.18:

  • Retrieve the Unicode code point of your character: for example, 'ü' = 'U+00FC'

  • According to this post, convert it to a UTF-8 entity

  • Decode it using utf8_decode before dumping

The line of code illustrating the example with the 'ü' character is:

var_dump(utf8_decode(html_entity_decode(preg_replace("/U\+([0-9A-F]{4})/", "&#x\\1;", 'U+00FC'), ENT_NOQUOTES, 'UTF-8')));

To summarize: for display purposes, go from Unicode to UTF-8, then decode it before displaying it.
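
For reference, the first two steps (code point notation to UTF-8) could also be sketched without the detour through HTML entities, using mb_chr (available since PHP 7.2). The function name codePointsToUtf8 is hypothetical, and the sketch deliberately stops at the UTF-8 string: it leaves out the final utf8_decode step, on the assumption that the page declares charset=utf-8 as in the question.

```php
<?php
// Hypothetical helper: replaces every "U+XXXX" notation in $text
// with the corresponding character, encoded as UTF-8.
function codePointsToUtf8(string $text): string
{
    return preg_replace_callback(
        // 4 to 6 hex digits; the match is greedy, so hex digits that
        // directly follow the notation are read as part of the code point.
        '/U\+([0-9A-Fa-f]{4,6})/',
        function (array $m): string {
            $char = mb_chr(hexdec($m[1]), 'UTF-8');
            // mb_chr returns false for invalid code points: keep the text as-is.
            return $char === false ? $m[0] : $char;
        },
        $text
    );
}
```

Called as codePointsToUtf8('U+00FC'), this gives the two-byte UTF-8 sequence for 'ü'.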

Other resources: a post on retrieving a character's Unicode code point
