Implementing a Character Encoding in Java

Original post: 2017-08-03 10:14:04. Tags: java, unicode, encoding, ascii

I was asked this question during an interview with a famous IT company. They asked me how a character encoding could be implemented if there are so many characters that the 16 bits of Unicode are not enough. I answered that we could implement a 64-bit encoding for characters. They said that even that would not be enough, so I suggested implementing an encoding with Java's BigInteger.

Then they said the encoding should use only the bits that are actually needed. For example, the ASCII representation of A is 01000001; we should not store the leading 0, because we don't need it and it wastes memory. I could not give an answer to that. Could you please tell me how to approach this problem and how it is handled in practice?

1 answer

See the Unicode Standard, Chapter 3: "The Unicode Standard supports three character encoding forms: UTF-32, UTF-16, and UTF-8. Each encoding form maps the Unicode code points U+0000..U+D7FF and U+E000..U+10FFFF to unique code unit sequences. The size of the code unit is specified for each encoding form. This section presents the formal definition of each of these encoding forms."
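To make the variable-length idea concrete, here is a minimal sketch of how UTF-8 spends only as many bytes as a code point needs. The class name `Utf8Sketch` and the method `encodeUtf8` are made up for this illustration; in real code you would simply call `String.getBytes(StandardCharsets.UTF_8)`, which the `main` method uses as a cross-check.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8Sketch {

    // Hand-written UTF-8 encoder for a single code point (illustrative only).
    // Small values get 1 byte; larger values get 2, 3, or 4 bytes.
    public static byte[] encodeUtf8(int cp) {
        if (cp < 0 || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            throw new IllegalArgumentException("not a Unicode scalar value: " + cp);
        }
        if (cp < 0x80) {            // up to 7 bits: 0xxxxxxx
            return new byte[] { (byte) cp };
        }
        if (cp < 0x800) {           // up to 11 bits: 110xxxxx 10xxxxxx
            return new byte[] {
                (byte) (0xC0 | (cp >> 6)),
                (byte) (0x80 | (cp & 0x3F)) };
        }
        if (cp < 0x10000) {         // up to 16 bits: 1110xxxx 10xxxxxx 10xxxxxx
            return new byte[] {
                (byte) (0xE0 | (cp >> 12)),
                (byte) (0x80 | ((cp >> 6) & 0x3F)),
                (byte) (0x80 | (cp & 0x3F)) };
        }
        return new byte[] {         // up to 21 bits: 11110xxx + three 10xxxxxx bytes
            (byte) (0xF0 | (cp >> 18)),
            (byte) (0x80 | ((cp >> 12) & 0x3F)),
            (byte) (0x80 | ((cp >> 6) & 0x3F)),
            (byte) (0x80 | (cp & 0x3F)) };
    }

    public static void main(String[] args) {
        int[] samples = { 0x41, 0xE9, 0x20AC, 0x1F600 };
        for (int cp : samples) {
            String s = new String(Character.toChars(cp));
            byte[] mine = encodeUtf8(cp);
            byte[] jdk = s.getBytes(StandardCharsets.UTF_8);
            // UTF_16BE is used so that no byte order mark is counted.
            int utf16Bytes = s.getBytes(StandardCharsets.UTF_16BE).length;
            System.out.printf("U+%04X  UTF-8 bytes: %d  UTF-16 bytes: %d  matches JDK: %b%n",
                    cp, mine.length, utf16Bytes, Arrays.equals(mine, jdk));
        }
    }
}
```

The leading bits of the first byte tell a decoder how many bytes follow, which is what lets the encoding stay compact for common characters while still covering the whole code point range.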

As for saving bits, this matters only when the text is very large, in which case I would suggest using compression, such as zip. Libraries in various languages let you read from and write to a compressed file directly.
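As a rough sketch of the compression suggestion in Java: the example below uses the GZIP streams from `java.util.zip` rather than a zip archive, and an in-memory buffer stands in for a file (swap in a `FileOutputStream` / `FileInputStream` for real files). The class name `CompressedTextDemo` and its two helper methods are assumptions made for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class CompressedTextDemo {

    // Writes text as UTF-8 through a GZIP stream and returns the compressed bytes.
    public static byte[] compress(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(new GZIPOutputStream(buffer), StandardCharsets.UTF_8)) {
            w.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads the text back directly from the compressed bytes.
    public static String decompress(byte[] data) {
        try (Reader r = new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(data)), StandardCharsets.UTF_8)) {
            StringBuilder sb = new StringBuilder();
            char[] chunk = new char[4096];
            int n;
            while ((n = r.read(chunk)) != -1) {
                sb.append(chunk, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("character encoding ");
        }
        String text = sb.toString();
        byte[] packed = compress(text);
        System.out.println("raw UTF-8 bytes:  " + text.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("compressed bytes: " + packed.length);
        System.out.println("round trip ok:    " + decompress(packed).equals(text));
    }
}
```

How much is saved depends on the text. Highly repetitive input like the sample above shrinks a lot, while very short strings can come out larger because of the GZIP header.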



solution1 1 2017-08-03 10:32:51