"Incorrect string value:" MySQL issue when inserting UTF8 text into a latin1 column

I have a MySQL table in production that uses the latin1 character set (collation latin1_swedish_ci).

Right now, there is incoming content (string: "\ud55c\ubc24\uc758") in UTF-8 format that needs to be inserted into a TEXT column called keywords in this table.

When I try to perform the INSERT, I get this error:

Incorrect string value: '\xED\x95\x9C\xEB\xB0\xA4...' for column 'keywords' at row 1

I have tried all kinds of ways in my Java code to convert from UTF8 to ISO-8859-1, like the one below, and I am still getting the same error:

String convertedString = new String(originalString.getBytes("UTF-8"), "ISO-8859-1");

I know there are solutions on StackOverflow that suggest changing the charset of the MySQL table from latin1 to UTF8. Unfortunately I cannot do that, because this is a live production MySQL master server and it has historically been using latin1.

Does anyone have any suggestions to fix this "Incorrect string value" error?

Thanks, IS

What you're trying to do simply isn't possible unless the characters in the utf8 string also happen to have representations in latin1. latin1 is a tiny single-byte character set (at most 256 possible characters in total), so the vast majority of valid utf8 characters have no equivalent latin1 representation.

You can't store any character in the column that the character set of the column doesn't support. It's not a matter of "converting" from one to the other.

If you need unicode, you need at least a utf8 column, and modifying the table is the only alternative. Trying to do otherwise is like trying to store a negative number in an unsigned integer column. Unsigned ints can't be negative; it's not a matter of conversion.

This would be true of any RDBMS that supports character data types, and is not a limitation specific to MySQL.
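To make the point about representability concrete, here is a minimal Java sketch that checks whether a string can be encoded in ISO-8859-1 at all. The class and method names (`Latin1Check`, `fitsInLatin1`) are hypothetical and not from the question. It also assumes ISO-8859-1 is a close enough stand-in for MySQL's latin1, which differs in a few code points but has no Korean characters either way.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class Latin1Check {
    /** Returns true if every character of s has an ISO-8859-1 encoding. */
    static boolean fitsInLatin1(String s) {
        // Create a fresh encoder per call: CharsetEncoder is not thread-safe.
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return encoder.canEncode(s);
    }

    public static void main(String[] args) {
        String korean = "\ud55c\ubc24\uc758"; // the string from the question
        String french = "caf\u00e9";          // e-acute exists in latin1
        System.out.println(fitsInLatin1(korean)); // false
        System.out.println(fitsInLatin1(french)); // true
    }
}
```

A check like this could be run before the INSERT to reject or reroute text that the latin1 column cannot hold, rather than relying on the server error.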

í•œë°¤ is the Mojibake for 한밤, meaning it got converted to latin1 at some stage. But \ud55c\ubc24 is Unicode. What mode is Python in? Do you have this at the beginning?

# -*- coding: utf-8 -*- 

More Python checklist.
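Since the question is tagged Java, the Mojibake effect can be sketched in Java as well. The snippet below decodes UTF-8 bytes as ISO-8859-1, which is the same pattern as the conversion attempted in the question. It shows that this does not convert the text: it reinterprets each UTF-8 byte as a separate latin1 character. The names `MojibakeDemo`, `misdecode` and `repair` are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
class MojibakeDemo {
    /** Reinterprets the UTF-8 bytes of s as ISO-8859-1 characters (produces Mojibake). */
    static String misdecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /** Reverses misdecode: recovers the original bytes and decodes them as UTF-8. */
    static String repair(String mojibake) {
        byte[] raw = mojibake.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\ud55c\ubc24";        // two Korean characters
        String garbled = misdecode(original);
        System.out.println(garbled.length());    // 6: one char per UTF-8 byte
        System.out.println(repair(garbled).equals(original)); // true
    }
}
```

The round trip only works because ISO-8859-1 maps every byte value to a character in Java. It illustrates where the garbled form comes from; it is not a way to make Korean text fit a latin1 column.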

More

utf8 is preferred; euckr is possible. But... The problem is not in picking the character set, it is in being consistent throughout the application in specifying that character set.

Are you using Python? The question is tagged Java.

For Java/JDBC, you need ?useUnicode=yes&characterEncoding=UTF-8 appended to the connection URL in the getConnection() call.
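As a rough sketch of what that looks like, the helper below builds a JDBC URL carrying those two parameters and passes it to `DriverManager.getConnection`. The class name, method names, host, port and database name are all hypothetical placeholders, and actually opening a connection assumes a MySQL JDBC driver is on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper, for illustration only.
class Utf8Connection {
    /** Builds a MySQL JDBC URL that asks the driver to talk UTF-8 to the server. */
    static String buildUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?useUnicode=yes&characterEncoding=UTF-8";
    }

    /** Opens a connection using the URL above; needs a MySQL driver at runtime. */
    static Connection open(String host, int port, String database,
                           String user, String password) throws SQLException {
        return DriverManager.getConnection(buildUrl(host, port, database), user, password);
    }
}
```

Note that this only fixes the client-to-server leg. The column still has to be able to store the characters, as the checklist that follows points out.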

You need these:

  • The bytes in your client need to be utf8, such as hex ED959C. (Korean characters are all 3 bytes in utf8.)
  • The connection between the client and the server needs to be utf8. Performing SET NAMES utf8 right after connecting is another way to do that.
  • The column/table needs to be CHARACTER SET utf8.
  • If you are using html, it will need <meta charset=UTF-8>.

For Korean, utf8mb4 is as good as utf8. Check those 4 bullet items above, and 'prove' to us that you are doing all of them.
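One possible way to check the first bullet from Java is to dump the UTF-8 bytes of the string as hex just before it is handed to JDBC. This is only a sketch; `Utf8Hex` and `utf8Hex` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class Utf8Hex {
    /** Returns the UTF-8 encoding of s as an uppercase hex string. */
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            // Mask to 0..255 so negative bytes format as two hex digits.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(utf8Hex("\ud55c")); // ED959C
    }
}
```

If the first character of the question's string prints as ED959C, the client side holds correct UTF-8 and the remaining bullets (connection, column, HTML) are where to look next.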

For JSP and Java Servlets, slightly different advice is warranted.


 