
Java vs Objective-C UTF-8 Encoding

I need to translate Java code to Objective-C, but I'm stuck on a string-to-byte-array conversion.

In Java I have:

String Key="1234567890";
byte[] xKey = Key.getBytes();
System.out.println(Arrays.toString(xKey));

And it prints:

[49, 50, 51, 52, 53, 54, 55, 56, 57, 48]

And in Objective-C I have:

NSString *Key = @"1234567890";
NSData *xKey = [Key dataUsingEncoding:NSUTF8StringEncoding]; // (1)
NSLog(@"%@", xKey);                                          // (2)

and it prints:

<31323334 3536373839 30>

For (1) I've also tried:

const char * xKey = [Key UTF8String];

and for (2) I've also tried:

NSLog(@"%s", xKey);

In UTF-8, 48 corresponds to the character '0'.

The correct UTF-8 encoded form of the String "1234567890" is simply the sequence of its character codes, because every character is encoded in a single byte (all of their code points are below 128):

[49, 50, 51, 52, 53, 54, 55, 56, 57, 48]

Note: Arrays.toString(byte[]) uses the decimal representation of the bytes to construct the String representation of the array.

If you look closer, the Objective-C result is exactly the same; it's just printed in hexadecimal:

<31323334 3536373839 30>

0x31 = 49
0x32 = 50
...
0x30 = 48
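
To check the correspondence on the Java side, the same bytes can be printed in both radixes. This is only a minimal sketch: the class name HexVsDecimal and the helper toHex are made up for illustration, and the hex layout is a plain space-separated one, not a reproduction of NSData's own formatting.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the original question.
public class HexVsDecimal {

    // Renders each byte as two lowercase hex digits, separated by spaces.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask with 0xff so the byte is treated as an unsigned value.
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] xKey = "1234567890".getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.toString(xKey)); // [49, 50, 51, 52, 53, 54, 55, 56, 57, 48]
        System.out.println(toHex(xKey));           // 31 32 33 34 35 36 37 38 39 30
    }
}
```

Both lines describe the same ten bytes; only the radix used for printing differs.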

By the way, quoting the Javadoc of String.getBytes():

Encodes this String into a sequence of bytes using the platform's default charset, storing the result into a new byte array.

Because the platform's default encoding is used, the result may differ from platform to platform. It is therefore always recommended to state explicitly which encoding you want, e.g.:

byte[] xKey = Key.getBytes(StandardCharsets.UTF_8);
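
For an ASCII-only key like the one in the question the charset rarely matters, but with a non-ASCII character the difference shows up immediately. A small sketch, assuming a made-up class name CharsetDemo and using "é" (U+00E9) purely as an illustrative input:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the original answer.
public class CharsetDemo {

    // Encodes the text with an explicitly named charset instead of the platform default.
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // "é", written as an escape so the source file encoding does not matter

        // UTF-8 needs two bytes for U+00E9: 0xC3 0xA9 (shown as signed Java bytes).
        System.out.println(Arrays.toString(encode(text, StandardCharsets.UTF_8)));      // [-61, -87]

        // ISO-8859-1 needs one byte: 0xE9.
        System.out.println(Arrays.toString(encode(text, StandardCharsets.ISO_8859_1))); // [-23]
    }
}
```

The negative numbers appear because Java bytes are signed; 0xC3 is 195 unsigned, which prints as -61.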

Use:

byte[] xKey = Key.getBytes(StandardCharsets.UTF_8);
