MySQL to JSON: Issue with encoding of German special characters in UTF-8

I wrote a little script that takes the data from a MySQL table and puts it into a JSON array. However, there's an issue with character encoding, even though I have set UTF-8 everywhere. Here's the script:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>JSON</title>
</head>

<?php

header('Content-type: text/html; charset=UTF-8');

$con = mysqli_connect("HOST", "USERNAME", "PASSWORD", "DATABASE");
if (!$con) {
    trigger_error('Could not connect to MySQL: ' . mysqli_connect_error());
}

// mysqli_set_charset() already configures the client, connection and
// result character sets, so no additional SET NAMES queries are needed.
mysqli_set_charset($con, "utf8");

$sql = "SELECT * FROM table";

$result = mysqli_query($con, $sql);

$rows = array();
while($r = mysqli_fetch_assoc($result)) {
    $rows[]=$r;
}

print json_encode($rows);


mysqli_close($con);

?>

</html>

In the output, I get the value "\u00e4" instead of "ä".

Some additional info:

  • The table uses utf8_general_ci (as do all of its columns)
  • The PHP file is saved as UTF-8

What am I doing wrong? Thanks for your help!

It looks to me like everything is working properly. The reason you see \u00e4 instead of ä is the implementation of the JSON serializer. What the serializer is doing is perfectly valid.
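One way to check this is to decode the output again. The snippet below is only a minimal sketch: the sample row is made up and stands in for a MySQL result, and the file is assumed to be saved as UTF-8. It shows the escaped output decoding back to the original string, and the JSON_UNESCAPED_UNICODE flag (PHP 5.4 and later) for the case where literal characters in the output are preferred.

```php
<?php
// Hypothetical sample row standing in for a MySQL result.
$rows = array(array('name' => 'Käse'));

// Default behaviour: non-ASCII characters are escaped as \uXXXX.
$escaped = json_encode($rows);
echo $escaped, "\n";

// Decoding restores the original UTF-8 string, so nothing is lost.
$decoded = json_decode($escaped, true);
echo $decoded[0]['name'], "\n";

// Optional: emit the characters literally instead (PHP 5.4+).
$literal = json_encode($rows, JSON_UNESCAPED_UNICODE);
echo $literal, "\n";
```

Any JSON parser on the receiving side should produce the same string from either form, so the flag is mostly a matter of readability and payload size.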

From the JSON RFC (RFC 4627), Section 2.5, Strings:

Any character may be escaped. If the character is in the Basic Multilingual Plane (U+0000 through U+FFFF), then it may be represented as a six-character sequence: a reverse solidus, followed by the lowercase letter u, followed by four hexadecimal digits that encode the character's code point. The hexadecimal letters A though F can be upper or lowercase. So, for example, a string containing only a single reverse solidus character may be represented as "\u005C".
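To illustrate the quoted rule, here is a small sketch using PHP's own json_decode (the variable names are just for illustration, and the file is assumed to be saved as UTF-8). It checks that the escaped and literal forms of the same character decode to identical strings, regardless of the case of the hex digits.

```php
<?php
// Single-quoted PHP strings leave \u untouched, so these are the
// raw JSON texts exactly as a client would receive them.
$lower   = json_decode('"\u00e4"');
$upper   = json_decode('"\u00E4"');
$literal = json_decode('"ä"');

var_dump($lower === $upper);    // hex digits are case-insensitive
var_dump($upper === $literal);  // escaped and literal forms are equal

// A reverse solidus written as \u005C or in the short form \\.
$long  = json_decode('"\u005C"');
$short = json_decode('"\\\\"');
var_dump($long === $short);
```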

I suspect this serializer escapes it for you because PHP doesn't natively support Unicode. From the PHP manual:

A string is series of characters, where a character is the same as a byte. This means that PHP only supports a 256-character set, and hence does not offer native Unicode support.
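The byte-oriented nature of PHP strings is easy to observe. In this short sketch, which assumes the file is saved as UTF-8, the single character ä occupies two bytes, and strlen reports the byte count rather than the character count.

```php
<?php
// Assumes this source file is saved as UTF-8.
$s = 'ä';

echo strlen($s), "\n";   // counts bytes, not characters
echo bin2hex($s), "\n";  // the UTF-8 bytes that encode U+00E4
```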
