PHP/MySQL with encoding problems

I am having trouble with PHP regarding encoding.

I have a JavaScript/jQuery HTML5 page that interacts with my PHP script using $.post. However, PHP is facing a weird problem, probably related to encoding.

When I write

htmlentities("í")

I expect PHP to output &iacute;. However, it outputs &Atilde;&shy; instead. At first I thought I was making some mistake with the encodings, however

htmlentities("í")=="&iacute;"?"Good":"Fail";

is outputting "Fail", whereas

htmlentities("í")=="&Atilde;&shy;"?"Good":"Fail";

does not. But htmlentities($search, null, "utf-8") works as expected.

I want to have PHP communicate with a MySQL server, but it has encoding problems too, even if I use utf8_encode. What should I do?

EDIT: In the SQL command, writing

SELECT id,uid,type,value FROM users,profile
WHERE uid=id AND type='name' AND value='XXX';

where XXX contains no í chars, works as expected, but it does not work if there is any 'í' char.

SET NAMES 'utf8';
SET CHARACTER SET 'utf8';
SELECT id,uid,type,value FROM users,profile
WHERE uid=id AND type='name' AND value='XXX';

This not only fails for í chars, it ALSO fails for strings without any 'special' characters. Removing the ' chars from SET NAMES and SET CHARACTER SET doesn't seem to change anything.

I am connecting to the MySQL database using PDO.

EDIT 2: I am using MySQL version 5.1.30 of XAMPP for Linux.

EDIT 3: Running SHOW VARIABLES LIKE '%character%' from PhpMyAdmin outputs

character_set_client    utf8
character_set_connection    utf8
character_set_database  latin1
character_set_filesystem    binary
character_set_results   utf8
character_set_server    latin1
character_set_system    utf8
character_sets_dir  /opt/lampp/share/mysql/charsets/

Running the same query from my PHP script (with print_r) outputs:

Array
(
    [0] => Array
        (
            [Variable_name] => character_set_client
            [0] => character_set_client
            [Value] => latin1
            [1] => latin1
        )

    [1] => Array
        (
            [Variable_name] => character_set_connection
            [0] => character_set_connection
            [Value] => latin1
            [1] => latin1
        )

    [2] => Array
        (
            [Variable_name] => character_set_database
            [0] => character_set_database
            [Value] => latin1
            [1] => latin1
        )

    [3] => Array
        (
            [Variable_name] => character_set_filesystem
            [0] => character_set_filesystem
            [Value] => binary
            [1] => binary
        )

    [4] => Array
        (
            [Variable_name] => character_set_results
            [0] => character_set_results
            [Value] => latin1
            [1] => latin1
        )

    [5] => Array
        (
            [Variable_name] => character_set_server
            [0] => character_set_server
            [Value] => latin1
            [1] => latin1
        )

    [6] => Array
        (
            [Variable_name] => character_set_system
            [0] => character_set_system
            [Value] => utf8
            [1] => utf8
        )

    [7] => Array
        (
            [Variable_name] => character_sets_dir
            [0] => character_sets_dir
            [Value] => /opt/lampp/share/mysql/charsets/
            [1] => /opt/lampp/share/mysql/charsets/
        )

)

Running

SET NAMES 'utf8';
SET CHARACTER SET 'utf8';
SHOW VARIABLES LIKE '%character%'

outputs an empty array.

It's very important to specify the encoding of htmlentities to match that of the input, as you did in your final example but omitted in the first three.

htmlentities($text,ENT_COMPAT,'utf-8');
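As a minimal sketch of why the charset argument matters: assuming the input is UTF-8 (where "í" is the two bytes 0xC3 0xAD) and a PHP version before 5.4, whose htmlentities defaulted to ISO-8859-1, the same bytes yield different entities depending on the encoding that is declared. The variable names are illustrative only.

```php
<?php
// "í" encoded as UTF-8 is two bytes: 0xC3 0xAD.
$utf8 = "\xC3\xAD";

// Declared as ISO-8859-1, each byte is treated as a separate character:
// 0xC3 is "Ã" and 0xAD is the soft hyphen.
$asLatin1 = htmlentities($utf8, ENT_COMPAT, 'ISO-8859-1');

// Declared as UTF-8, the two bytes are read as the single character "í".
$asUtf8 = htmlentities($utf8, ENT_COMPAT, 'UTF-8');

echo $asLatin1, "\n"; // &Atilde;&shy;
echo $asUtf8, "\n";   // &iacute;
```

This matches the symptom in the question: the "wrong" output is what the ISO-8859-1 interpretation of a UTF-8 "í" looks like.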

Regarding communications with MySQL, you need to make sure the connection collation and character set match the data you are transmitting. You can either set this in the configuration file, or at runtime using the following queries:

SET NAMES utf8;
SET CHARACTER SET utf8;
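Since the question uses PDO, the runtime approach could look roughly like the sketch below. The function name is hypothetical, and the sketch rests on an unconfirmed assumption: that the empty array in the question came from sending several statements in one query() call, so that the result set read back belonged to SET NAMES (which returns no rows) rather than to SHOW VARIABLES. Sending each statement on its own avoids that ambiguity.

```php
<?php
// Hypothetical helper: switch an existing PDO MySQL connection to utf8
// and return the character-set variables the session then reports.
function connectionCharsets(PDO $pdo): array
{
    // Throw exceptions on SQL errors instead of failing silently.
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    // SET NAMES returns no rows, so run it separately with exec().
    $pdo->exec("SET NAMES utf8");

    // A separate query, so the result set read here is the one we want.
    $stmt = $pdo->query("SHOW VARIABLES LIKE 'character_set_%'");

    // Two-column result: map Variable_name => Value.
    return $stmt->fetchAll(PDO::FETCH_KEY_PAIR);
}
```

SET CHARACTER SET is left out of the sketch on purpose: per the MySQL manual it sets the connection character set to that of the default database, which would be latin1 for the server shown in the question.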

Make sure the table, database and server character sets match as well. There is one setting you can't change at run-time, and that's the server's character set. You need to modify it in the configuration file:

[mysqld]
character-set-server = utf8
default-character-set = utf8 
skip-character-set-client-handshake

Read more on character sets and collations in MySQL in the manual.
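To check whether everything matches, the SHOW VARIABLES output can be compared against the expected character set. The helper below is a hypothetical sketch, not part of any library. It takes Variable_name => Value pairs, skips character_set_filesystem and character_set_system (which are not expected to match the data charset), and is run here on the values the question's PHP script reported.

```php
<?php
// Hypothetical helper: list the character_set_* variables whose value
// differs from the expected character set.
function charsetMismatches(array $variables, string $expected = 'utf8'): array
{
    // Not expected to match the data charset, so they are not compared.
    $ignored = ['character_set_filesystem', 'character_set_system'];

    $mismatches = [];
    foreach ($variables as $name => $value) {
        // Skip anything that is not a character_set_* variable.
        if (strpos($name, 'character_set_') !== 0 || in_array($name, $ignored, true)) {
            continue;
        }
        if ($value !== $expected) {
            $mismatches[$name] = $value;
        }
    }
    return $mismatches;
}

// Values reported by the PHP script in the question.
$fromPhp = [
    'character_set_client'     => 'latin1',
    'character_set_connection' => 'latin1',
    'character_set_database'   => 'latin1',
    'character_set_filesystem' => 'binary',
    'character_set_results'    => 'latin1',
    'character_set_server'     => 'latin1',
    'character_set_system'     => 'utf8',
    'character_sets_dir'       => '/opt/lampp/share/mysql/charsets/',
];

print_r(charsetMismatches($fromPhp));
```

For the PHP connection in the question this flags client, connection, database, results and server, whereas the PhpMyAdmin session would only flag database and server.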

Late revival. But for further reference here are some extra tips:

  1. Use mysql_set_charset instead of SET xxx
  2. Make sure you are saving the file with UTF-8 encoding (this is often overlooked)
  3. Set headers:
    <?php header("Content-type: text/html; charset=utf-8"); ?>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

  4. If your Apache server configuration contains an AddDefaultCharset directive with a different encoding, go yell at your host administrator.
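For PDO, which the question uses, the counterpart of tip 1 is the charset option in the DSN (the MySQL driver ignores it before PHP 5.3.6). The sketch below uses hypothetical function names, and the host and database name are placeholders rather than values from the question.

```php
<?php
// Build a PDO MySQL DSN that requests a utf8 connection.
function utf8Dsn(string $host, string $dbname): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=utf8', $host, $dbname);
}

// Open the connection; all four arguments are supplied by the caller.
function openUtf8Connection(string $host, string $dbname, string $user, string $pass): PDO
{
    return new PDO(utf8Dsn($host, $dbname), $user, $pass, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
}

echo utf8Dsn('localhost', 'example_db'), "\n";
// mysql:host=localhost;dbname=example_db;charset=utf8
```

Setting the charset at connection time, rather than with a later SET statement, also lets the client library know which encoding is in use.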

I just ran into this issue. I have a whole website's content in Spanish, with all the special characters you can expect (áéíóúñ) and their capital letter versions.

In my case it was an inconsistency with the server charset/collation. Everything else was set to utf8 except the server charset, which was latin1. This caused all utf8 data entered in the database to display in its raw encoded form; for example, í would appear as an A with tilde (Ã) ...

I am using mysqli, and to fix it, I made use of the method explained above by Anthony Accioly (using mysql_set_charset). Said method has a mysqli version and that is what I used.

After that, I was puzzled. I still had a mess when viewing my website. Of course, I didn't know that by changing that latin1 to utf8 I would also mess up the character encoding/decoding of the whole thing. So I used the help of an online string encoder/decoder to fix my table data.

I made various exports of all my content data (you can set them up to produce update queries, which makes the update process faster) and ran the SQL output through the aforementioned online encoder/decoder, then copy-pasted the fixed queries into the phpMyAdmin SQL panel... thus fixing my encoding errors. Everything is now how it should be, AND I am able to process lossy searches again: Maria, maria, maría, mariá will all match maría, maria, Maria, etc. All acute-accented characters evaluate to their base vowel character. Epic Win.
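The repair step can also be scripted instead of using an online tool. The sketch below is hypothetical and assumes the damage was exactly one round of "UTF-8 bytes read as ISO-8859-1 and encoded to UTF-8 again", which the answer does not confirm. It needs the mbstring extension. MySQL's latin1 is closer to Windows-1252, so characters in the 0x80 to 0x9F range may need different handling.

```php
<?php
// Hypothetical helper: undo one round of double encoding, leaving
// strings that do not look double-encoded untouched.
function repairDoubleEncoded(string $text): string
{
    // Map each UTF-8 character back to a single ISO-8859-1 byte.
    $candidate = mb_convert_encoding($text, 'ISO-8859-1', 'UTF-8');

    // The conversion must be lossless, otherwise characters outside
    // ISO-8859-1 were replaced and the input was not double-encoded.
    $roundTrips = mb_convert_encoding($candidate, 'UTF-8', 'ISO-8859-1') === $text;

    // The recovered bytes must themselves form valid UTF-8.
    if ($roundTrips && mb_check_encoding($candidate, 'UTF-8')) {
        return $candidate;
    }
    return $text;
}

// "marÃ­a" (double-encoded) becomes "maría" again.
echo repairDoubleEncoded("mar\xC3\x83\xC2\xADa"), "\n";
```

A string that legitimately contains a sequence such as "Ã" followed by a soft hyphen would be altered as well, so it is safer to run this against an export and review the result before updating the tables.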
