How do I convert a character position to a byte position in a UTF-8 file?

Question

I have a UTF-8 encoded text file, and I can read it character by character. Each character can be encoded as a single byte or as multiple bytes. How can I tell where one byte was read and where more than one byte was read?

Answer 1

Count the bytes while reading the chars.

For each char c:

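// Assumes c is a UTF-16 code unit, as a char is in Java or C#.
if (c < 0x80)
  bytesCount += 1;  // ASCII: 1 byte
else if (c < 0x800)
  bytesCount += 2;  // U+0080..U+07FF: 2 bytes
else if (c >= 0xD800 && c <= 0xDFFF)
  bytesCount += 2;  // half of a surrogate pair; the pair is 4 bytes in UTF-8
else
  bytesCount += 3;  // rest of the Basic Multilingual Plane: 3 bytes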

See also the definition of the encoding in the Wikipedia article on UTF-8.
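
The counting rule above can be wrapped into a helper that turns a char index into a byte offset. This is only a sketch, assuming Java, well-formed text (no unpaired surrogates), and a file without a byte order mark; the class and method names (Utf8Offsets, utf8Length, byteOffset) are made up for illustration. It walks the text by code point, so a surrogate pair is counted once as 4 bytes.

```java
final class Utf8Offsets {
    // Number of UTF-8 bytes needed to encode one Unicode code point.
    static int utf8Length(int codePoint) {
        if (codePoint < 0x80) return 1;     // ASCII
        if (codePoint < 0x800) return 2;    // U+0080..U+07FF
        if (codePoint < 0x10000) return 3;  // rest of the BMP
        return 4;                           // supplementary planes
    }

    // Byte offset at which the char at charIndex starts.
    // Hypothetical helper: assumes well-formed text and no byte order mark.
    static long byteOffset(CharSequence text, int charIndex) {
        long bytes = 0;
        int i = 0;
        while (i < charIndex) {
            int cp = Character.codePointAt(text, i);
            bytes += utf8Length(cp);
            i += Character.charCount(cp);   // 2 chars for a surrogate pair
        }
        return bytes;
    }
}
```

If charIndex points into the middle of a surrogate pair, this sketch returns the offset just after the whole pair. For short strings the result can be cross-checked against text.substring(0, charIndex).getBytes(StandardCharsets.UTF_8).length.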

Answer 1
0 2013-02-08 23:35:19
