How to handle SQL state [HY000]; error code [1366]; Incorrect string value?
I'm aware this error means a MySQL column doesn't accept the value, but this is strange, since the value fits in a Java UTF-8 encoded string and the MySQL column is utf8_general_ci. Also, all UTF-8 characters have worked properly so far, apart from these.
The use case: I am importing tweets. The tweet in question is https://twitter.com/bakervin/status/210054214951518212 (you can see the two "strange" characters, and two strange whitespaces between them). The question is how to handle this.
These appear to be Unicode surrogate pairs. Each pair encodes a character outside the Basic Multilingual Plane, which takes 4 bytes in UTF-8, while MySQL's utf8 character set stores at most 3 bytes per character, so the column rejects them. If dropping those characters is acceptable, you can trim them:
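    // text is the incoming tweet text (a String)
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < text.length(); i++) {
        char ch = text.charAt(i);
        // Skip both halves of every surrogate pair; keep everything else
        if (!Character.isHighSurrogate(ch) && !Character.isLowSurrogate(ch)) {
            sb.append(ch);
        }
    }
    return sb.toString();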
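For anyone who wants to check whether a string would trigger the error before inserting it, here is a minimal sketch of the same idea written against code points instead of individual chars. The class and method names (SupplementaryChars, hasSupplementary, stripSupplementary) are made up for illustration, and it assumes Java 8 or later for String.codePoints().

```java
// Illustrative sketch only; the class and method names are hypothetical.
final class SupplementaryChars {

    // True if the text contains any code point above U+FFFF. Such code points
    // need 4 bytes in UTF-8 and are stored in Java as a surrogate pair.
    static boolean hasSupplementary(String text) {
        return text.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    // Drops supplementary code points and any unpaired surrogates,
    // keeping every other character unchanged.
    static String stripSupplementary(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        text.codePoints()
            .filter(cp -> Character.isBmpCodePoint(cp) && !Character.isSurrogate((char) cp))
            .forEach(cp -> sb.append((char) cp));
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+1F600 written as a surrogate pair, standing in for the tweet's characters
        String tweet = "ok \uD83D\uDE00 done";
        System.out.println(hasSupplementary(tweet));
        System.out.println(stripSupplementary(tweet));
    }
}
```

The result should match the loop above for any input. The only practical difference is that hasSupplementary lets you log or count the affected tweets instead of silently dropping characters.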