Can someone help me figure this out? About Unicode

Question

hibyte  lobyte  makeunicode
 250     65      57345

I have this table. The hibyte and lobyte columns are the high byte and low byte of a Chinese character, which may be encoded in Big5 or GBK.

I think the makeunicode column is the Unicode code point corresponding to the Big5/GBK character formed by hibyte and lobyte.

But when I try to display them, they show different characters, so something must be wrong. Can someone help me?

Answer 1

I don't really understand what you want, but from your high byte and low byte, I got it to print a Chinese character:

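// Note: new String(byte[], String) throws UnsupportedEncodingException for an unknown charset name.
byte[] bytes = {(byte) 250, (byte) 65};      // 0xFA, 0x41
String str = new String(bytes, "GBK");       // decode the two bytes as GBK
System.out.println(str);                     // prints: 鶤
System.out.println((int) str.charAt(0));     // prints: 40356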

I don't know where your "57345" comes from.

Answer 2

5 seconds of Googling turns up http://www.chinesecomputing.com/encodings/index.html . Converting Big5 or GBK to Unicode is just a table lookup. I'm not sure what you're doing with your bytes, however, as 250*256+65 = 64065, not 57345.
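The arithmetic in this answer can be checked directly and compared with what a charset decoder returns for the same two bytes. This is only a rough sketch: the class and method names (HiLoBytesSketch, combine, decode) are made up for illustration, and it assumes the JVM ships the "GBK" charset.

```java
import java.nio.charset.Charset;

class HiLoBytesSketch {
    // Combine the two bytes arithmetically: hi * 256 + lo.
    static int combine(int hi, int lo) {
        return ((hi & 0xFF) << 8) | (lo & 0xFF);
    }

    // Decode the two bytes with the named charset and return the first code point.
    // Charset.forName throws UnsupportedCharsetException if the JVM lacks the charset.
    static int decode(int hi, int lo, String charsetName) {
        byte[] bytes = {(byte) hi, (byte) lo};
        String s = new String(bytes, Charset.forName(charsetName));
        return s.codePointAt(0);
    }

    public static void main(String[] args) {
        int combined = combine(250, 65);
        System.out.printf("250*256+65 = %d (0x%X)%n", combined, combined);
        System.out.printf("GBK decode = %d%n", decode(250, 65, "GBK"));
        System.out.printf("table says = %d%n", 57345);
    }
}
```

Neither number printed for the bytes matches the 57345 in the table, which is the mismatch the question is about.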

Answer 3

57345 is 0xE001 in hex, which lies in the Unicode Private Use Area (U+E000 to U+F8FF), so no standard character is defined for it (see the full list here: http://www.unicode.org/Public/UNIDATA/UnicodeData.txt )

But if you do 250*256+65, you'll get 0xFA41, which is

FA41;CJK COMPATIBILITY IDEOGRAPH-FA41;Lo;0;L;654F;;;;N;;;;;

That is a CJK compatibility ideograph. Maybe that's the way?
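To see how Java itself classifies the two candidate code points, something like the following could be used. It is a sketch only: CodePointKindSketch and isPrivateUse are hypothetical names, and the printed names depend on the Unicode data bundled with the running JDK.

```java
class CodePointKindSketch {
    // True when the code point has general category Co (private use).
    static boolean isPrivateUse(int codePoint) {
        return Character.getType(codePoint) == Character.PRIVATE_USE;
    }

    public static void main(String[] args) {
        int fromTable = 57345;          // the "makeunicode" value, 0xE001
        int fromBytes = 250 * 256 + 65; // hibyte and lobyte combined, 0xFA41
        for (int cp : new int[] {fromTable, fromBytes}) {
            System.out.printf("U+%04X privateUse=%b name=%s%n",
                    cp, isPrivateUse(cp), Character.getName(cp));
        }
    }
}
```

A private-use code point has no standard glyph, so what gets displayed for it depends entirely on the font or application in use.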

Answer 4

Similar to newacct's answer, but just to show that it prints this character for other Chinese encodings, too:

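byte[] b = new byte[] {(byte) 250, (byte) 65};   // 0xFA, 0x41
String s = new String(b, "GB18030");
// Write the character back out as GB18030 (needs java.io.*); try-with-resources closes the writer.
try (Writer out = new OutputStreamWriter(new FileOutputStream(new File("c:\\a.html")), "GB18030")) {
    out.write(s);
}

Writes 鶤 to the file.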

Answer 5

The first byte (hibyte) of Big5 ranges from 0xA1 to 0xF9, while for GBK it ranges from 0x81 to 0xFE.

Your hibyte is 250 (0xFA), so it is obviously not encoded with Big5. It may be GBK or GB18030.

GB18030 is backward compatible with GBK, though.
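The range check described above can be written out as a small sketch. The ranges are the ones quoted in this answer, not taken from a specification, and the names LeadByteSketch, isBig5Lead and isGbkLead are hypothetical.

```java
class LeadByteSketch {
    // Big5 first-byte range as quoted in the answer: 0xA1 to 0xF9.
    static boolean isBig5Lead(int b) {
        int v = b & 0xFF;
        return v >= 0xA1 && v <= 0xF9;
    }

    // GBK first-byte range as quoted in the answer: 0x81 to 0xFE.
    static boolean isGbkLead(int b) {
        int v = b & 0xFF;
        return v >= 0x81 && v <= 0xFE;
    }

    public static void main(String[] args) {
        int hibyte = 250; // 0xFA, from the table in the question
        System.out.println("Big5 lead byte? " + isBig5Lead(hibyte));
        System.out.println("GBK lead byte?  " + isGbkLead(hibyte));
    }
}
```

With hibyte = 250 the first check fails and the second passes, which is the basis for ruling out Big5 here.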

5 answers

solution1
1 2009-10-08 03:50:54

solution2
0 ACCEPTED 2009-10-08 03:38:01

solution3
0 2009-10-08 03:38:24

solution4
0 2009-10-08 04:27:29

solution5
0 2010-09-06 06:46:05
