
Why is my Java String shorter in length than the byte[] array it was generated from?

I'm reading a blob from a MySQL database using JDBC. I know the resulting byte array is good: I've sent it over HTTP as a string of decimal numbers, one per byte, and successfully rebuilt the original file (a jpg) from the download. (This was just to prove that the MySQL -> Java servlet data is good.)

Constructing a new String from this byte array using UTF-8 yields a string that is shorter in length than the byte array, with values I can't decipher. If UTF-8 uses AT LEAST 1 byte per character, shouldn't the resulting string be AT A MINIMUM the length of the byte array it was generated from? (For this particular example, the byte length is 12,079,474 and the resulting string length is 11,501,845.)

Thanks for your time!

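Your reasoning is backwards: UTF-8 uses at least 1 byte per character, so a character can take more than one byte, and the decoded string can only be as long as the byte array or shorter. In your bytes, you have data that is interpreted as lead bytes and continuation bytes, i.e. in UTF-8 they have special meaning and together form one Unicode character from multiple bytes. String.length() counts characters (UTF-16 code units), not bytes, so every multi-byte sequence that happens to occur in your data shrinks the count. That is why your string is shorter than the number of bytes. A jpg is binary data rather than UTF-8 text, which is also why the resulting values look undecipherable.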
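To make the effect concrete, here is a minimal sketch. The class name Utf8LengthDemo and its helper methods are made up for illustration, and the sample bytes are hand-picked rather than taken from the asker's blob. It decodes a few short byte arrays as UTF-8 and compares the byte count with String.length(), then checks whether decoding and re-encoding gives back the original bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; names are illustrative, not from the original post.
public class Utf8LengthDemo {

    // Decode the bytes as UTF-8 and return the number of UTF-16 code units.
    static int decodedLength(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8).length();
    }

    // True if decoding as UTF-8 and encoding again reproduces the same bytes.
    static boolean survivesRoundTrip(byte[] bytes) {
        String decoded = new String(bytes, StandardCharsets.UTF_8);
        return Arrays.equals(decoded.getBytes(StandardCharsets.UTF_8), bytes);
    }

    public static void main(String[] args) {
        // Plain ASCII: one byte per character.
        byte[] ascii = {0x4A, 0x50, 0x47};
        // 0xC3 0xA9 is the two-byte UTF-8 encoding of U+00E9.
        byte[] twoByteSequence = {(byte) 0xC3, (byte) 0xA9};
        // 0xE2 0x82 0xAC is the three-byte UTF-8 encoding of U+20AC.
        byte[] threeByteSequence = {(byte) 0xE2, (byte) 0x82, (byte) 0xAC};
        // 0xFF 0xD8 is how a JPEG file starts; it is not valid UTF-8.
        byte[] jpegMarker = {(byte) 0xFF, (byte) 0xD8};

        System.out.println(ascii.length + " bytes -> " + decodedLength(ascii) + " chars");
        System.out.println(twoByteSequence.length + " bytes -> " + decodedLength(twoByteSequence) + " chars");
        System.out.println(threeByteSequence.length + " bytes -> " + decodedLength(threeByteSequence) + " chars");
        System.out.println("JPEG marker survives round trip: " + survivesRoundTrip(jpegMarker));
    }
}
```

The multi-byte cases each collapse to a single char, while the JPEG marker bytes do not survive a round trip because the decoder substitutes replacement characters for invalid input. So if the goal is to transport the blob, keeping it as a byte[] (or using a binary-safe text encoding such as Base64) is likely a safer choice than new String(bytes, "UTF-8").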


 