
Widechar to Bytes using bits pattern?

If the number of bytes in a UTF-8 encoded wide character is known, would it be possible to get the bytes using the following method?

For example:

Converting the wide character ¿ (code 191) to bytes gives -62 and -65.

I've tried to fit the 8 bits of 191 into the slots, but didn't get the same result:

110[0][0][0][1][0]   10[1][1][1][1][1][1]

      127                   255

First, don't convert to signed bytes. That just confuses matters. Read as unsigned values, -62 and -65 are 194 and 191, so code point 191 yields the byte sequence 194 191:

Decimal: 194                   191
Binary:  110[0][0][0][1][0]    10[1][1][1][1][1][1]
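
As a rough illustration of the layout above, here is a minimal Python sketch that fills the `110xxxxx 10xxxxxx` slots for a code point in the two-byte range (128 to 2047). The function names `to_unsigned` and `encode_two_byte` are made up for this example, and it assumes the -62 and -65 from the question are the same two bytes read as signed 8-bit integers.

```python
def to_unsigned(signed_byte: int) -> int:
    """Reinterpret a signed 8-bit value (-128..127) as unsigned (0..255)."""
    return signed_byte & 0xFF


def encode_two_byte(code_point: int) -> tuple:
    """Encode a code point in the range 0x80..0x7FF as two UTF-8 bytes."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("code point does not fit the two-byte pattern")
    # Lead byte: marker 110 followed by the top 5 payload bits.
    lead = 0b11000000 | (code_point >> 6)
    # Continuation byte: marker 10 followed by the low 6 payload bits.
    continuation = 0b10000000 | (code_point & 0b00111111)
    return lead, continuation


print(encode_two_byte(191))                # (194, 191)
print(to_unsigned(-62), to_unsigned(-65))  # 194 191
```

The result can be cross-checked against Python's built-in encoder with `bytes(encode_two_byte(191)) == "¿".encode("utf-8")`.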

To see how these bytes relate to the code point, start from the right edge. You take the six payload bits from the 191 and the low two payload bits from the 194; the remaining three payload bits of the 194 are left over as leading zeros, yielding:

Binary:  00000[0][0][0]    [1][0][1][1][1][1][1][1]
Decimal: 0                 191
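
The extraction described above can be sketched in Python as well. This is only an illustration for the two-byte case; `decode_two_byte` is a hypothetical helper name, not a library function.

```python
def decode_two_byte(lead: int, continuation: int) -> int:
    """Recover the code point from a two-byte UTF-8 sequence."""
    if (lead & 0b11100000) != 0b11000000:
        raise ValueError("lead byte does not match 110xxxxx")
    if (continuation & 0b11000000) != 0b10000000:
        raise ValueError("continuation byte does not match 10xxxxxx")
    high = lead & 0b00011111         # 5 payload bits from the lead byte
    low = continuation & 0b00111111  # 6 payload bits from the continuation byte
    return (high << 6) | low


print(decode_two_byte(194, 191))  # 191
```

For 194 and 191 the lead byte contributes the payload `00010` and the continuation byte contributes `111111`, which combine to `00010111111`, or 191.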

Wikipedia has a surprisingly good writeup on how this all works.
