
Java vs Objective-C UTF-8 Encoding

I need to translate Java code to Objective-C, but I'm stuck on a string-to-byte-array conversion.

In Java I have:

String Key="1234567890";
byte[] xKey = Key.getBytes();
System.out.println(Arrays.toString(xKey));

And it prints:

[49, 50, 51, 52, 53, 54, 55, 56, 57, 48]

And in Objective-C I have:

NSString *Key = @"1234567890";
(1) NSData * xKey = [Key dataUsingEncoding:NSUTF8StringEncoding];
(2) NSLog(@"%@", xKey);

and it prints:

<31323334 3536373839 30>

In (1) I've also tried:

const char * xKey = [Key UTF8String];

and in (2) I've used:

NSLog(@"%@s", xKey);

In UTF-8, the byte value 48 corresponds to the character '0'.

The correct UTF-8 encoded form of the String "1234567890" is simply the sequence of its character codes, because every character is encoded in a single byte (all codes are below 128, i.e. in the ASCII range):

[49, 50, 51, 52, 53, 54, 55, 56, 57, 48]

Note: Arrays.toString(byte[]) uses the decimal representation of the bytes to construct the String representation of the array.

If you look closer, the Objective-C result is exactly the same; it's just printed in hexadecimal:

<31323334 3536373839 30>

0x31 = 49
0x32 = 50
...
0x30 = 48
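
To check the mapping above from the Java side, the same bytes can be printed in hexadecimal instead of decimal. This is only a minimal sketch: the class name HexDump and the helper toHex are made up for illustration and are not part of the original question.

```java
import java.nio.charset.StandardCharsets;

public class HexDump {
    // Hypothetical helper: formats each byte as two lowercase hex digits,
    // similar to the digits NSLog shows for an NSData object.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            // Mask with 0xff so the byte is treated as an unsigned value 0..255.
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] xKey = "1234567890".getBytes(StandardCharsets.UTF_8);
        System.out.println(toHex(xKey)); // prints 31323334353637383930
    }
}
```

Apart from the spaces and angle brackets that NSLog adds, these are the same hex digits as in the Objective-C output.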

By the way, regarding String.getBytes(), quoting from the javadoc:

Encodes this String into a sequence of bytes using the platform's default charset, storing the result into a new byte array.

Because the platform's default charset is used, the result may differ from platform to platform. It is therefore recommended to always state explicitly which encoding you want, e.g.:

byte[] xKey = Key.getBytes(StandardCharsets.UTF_8);
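
The digits in the question encode to the same bytes in most common charsets, so the difference only becomes visible with non-ASCII characters. The following sketch illustrates this with the character é (written as \u00e9); the class name CharsetDemo and the helper encode are hypothetical names chosen for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CharsetDemo {
    // Hypothetical helper: encodes s with an explicitly chosen charset and
    // returns the signed decimal view that Arrays.toString produces.
    static String encode(String s, Charset charset) {
        return Arrays.toString(s.getBytes(charset));
    }

    public static void main(String[] args) {
        // ASCII digits: identical bytes in both charsets.
        System.out.println(encode("1234567890", StandardCharsets.UTF_8));
        System.out.println(encode("1234567890", StandardCharsets.ISO_8859_1));

        // Non-ASCII character: two bytes in UTF-8, one byte in ISO-8859-1.
        System.out.println(encode("\u00e9", StandardCharsets.UTF_8));      // [-61, -87]
        System.out.println(encode("\u00e9", StandardCharsets.ISO_8859_1)); // [-23]
    }
}
```

Passing the charset explicitly keeps the Java output in line with NSUTF8StringEncoding on the Objective-C side, regardless of the default charset of the machine running the Java code.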

Use:

byte[] xKey = Key.getBytes(StandardCharsets.UTF_8);
