Exact Same MySQL Table Structure/Data, Different Result To Same Query

Here's my situation.

I'm migrating from one server to another. As part of this, I'm moving the database data across.

The migration method involved running the same CREATE TABLE query on the new server, then using a series of INSERT commands to insert the data row by row. It's possible this resulted in different data; however, the CHECKSUM TABLE command was used to validate the contents. CHECKSUM was done on the whole table after the transfer, on a new table with that row isolated, and after truncating the string with the LEFT function. Every time, the result was identical between the old and new server, indicating the raw data should be exactly identical at the byte level.

CHECKSUM TABLE `test`

I've checked the structure and it's exactly the same as well.

SHOW CREATE TABLE `test`

Here is the structure:

CREATE TABLE test (
  item varchar(32) COLLATE utf8_unicode_ci NOT NULL,
  amount mediumint(5) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

The field is of type:

`item` varchar(32) COLLATE utf8_unicode_ci NOT NULL

Here is my connection code in PHP:

$sql = new mysqli($db_host, $db_user, $db_pass, $db_name);
if ($sql->connect_error) {
  die('Connect Error ('.$sql->connect_errno.') '.$sql->connect_error);
}

When I retrieve the data in PHP with a simple query:

SELECT * FROM `test`

The data displays like this:

§lO §10

On the old server/host, I get this sequence of raw bytes:

Decimal: -194-167-108-79-
HEX: -C2-A7-6C-4F-

And on the new server, I get a couple of extra bytes at the beginning:

Decimal: -195-130-194-167-108-79-
HEX: -C3-82-C2-A7-6C-4F-

Why might the exact same raw data, table structure, and query return a different result between the two servers? What should I do to ensure that results are as consistent as possible in the future?

Â§lO is "Mojibake" for §lO. I presume the latter (3-character) is "correct"?

"The raw data looks like this (in both cases when I display it)" is bogus because the technique used for displaying it probably messed with the encoding.

Since the 3 characters became 4 bytes and then became 6 bytes, you probably have "double-encoding".

This discusses how "double encoding" can occur: Trouble with UTF-8 characters; what I see is not what I stored
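
As a minimal sketch of what double encoding does at the byte level, the following Python snippet reproduces the two byte sequences shown in the question. It assumes the charset used for the wrong interpretation is latin1 (ISO-8859-1), which the question does not confirm; the helper names `double_encode` and `dashed_hex` are made up for illustration.

```python
def double_encode(text: str, wrong_charset: str = "latin-1") -> bytes:
    """Encode text as UTF-8, misread those bytes as wrong_charset,
    then encode the misread text as UTF-8 a second time."""
    utf8_bytes = text.encode("utf-8")
    misread = utf8_bytes.decode(wrong_charset)  # assumption: latin1 was the wrong charset
    return misread.encode("utf-8")


def dashed_hex(data: bytes) -> str:
    """Format bytes in the -XX-XX- style used in the question."""
    return "-" + "-".join(f"{b:02X}" for b in data) + "-"


# "\u00a7" is the section sign, followed by lowercase l and uppercase O.
correct = "\u00a7lO".encode("utf-8")
doubled = double_encode("\u00a7lO")

print(dashed_hex(correct))  # -C2-A7-6C-4F-
print(dashed_hex(doubled))  # -C3-82-C2-A7-6C-4F-
```

The second line matches the bytes reported for the new server, which is consistent with the double-encoding diagnosis, though it does not by itself show where in the pipeline the extra encoding step happens.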

If you provide some more info (CREATE TABLE, hex, method of migrating the data, etc), we may be able to further unravel the mess you have.

More

When using mysqli, do $sql->set_charset('utf8');

(The HEX confirms my analysis.)

"The migration method involved running the same CREATE TABLE query on the new server"

Was it preceded by some character set settings, as in mysqldump?

"then using a series of INSERT commands to insert the data row by row."

Can you get the HEX of some accented character in the file?

"... CHECKSUM ..."

OK, the checksums being the same rules out one thing.

"CHECKSUM was done on... a new table with that row isolated"

How did you do that? SELECTing the row could have modified the text, thereby invalidating the test.

"indicating the raw data should be exactly identical at the byte level."

For checking the data in the table, SELECT HEX(col)... is the only way to bypass all possible character set conversions that could happen. Please provide the HEX for some column with a non-ASCII character (such as the example given). And do the CHECKSUM against the HEX output.

And provide SHOW VARIABLES LIKE 'char%';
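
Once the output of SELECT HEX(col) is available, it can be checked offline for signs of double encoding. The sketch below is a heuristic, not a definitive test: it assumes the column is meant to hold UTF-8 text and that any double encoding went through latin1 (ISO-8859-1). `looks_double_encoded` is a hypothetical helper name, not part of MySQL or PHP.

```python
def looks_double_encoded(hex_value: str) -> bool:
    """Return True if the bytes still decode as valid UTF-8 after one
    layer of UTF-8-over-latin1 encoding has been peeled off."""
    raw = bytes.fromhex(hex_value)
    try:
        # Undo one assumed round of "UTF-8 bytes misread as latin1".
        peeled = raw.decode("utf-8").encode("latin-1")
        # Properly encoded text usually fails here, because the peeled
        # bytes are no longer valid UTF-8.
        peeled.decode("utf-8")
    except (UnicodeDecodeError, UnicodeEncodeError):
        return False
    # Pure ASCII survives the round trip unchanged and is not double-encoded.
    return peeled != raw


print(looks_double_encoded("C2A76C4F"))      # False
print(looks_double_encoded("C382C2A76C4F"))  # True
```

Running this on the HEX from both servers would show whether the stored bytes themselves are double-encoded or whether the extra bytes only appear on the way out through the connection.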
