Should InputStreamReader be used with appendCodePoint?

Question

It is a common pattern in Java to read characters from a file with InputStreamReader and append them to a StringBuilder. The obvious way to do it is:

int c = reader.read();
sb.append((char)c);

However, suppose the file (assume we specified UTF-8 encoding, if it makes a difference) contains a character (strictly speaking, a code point) that doesn't fit in 16 bits. Would the reader return it as a single 32-bit code point instead of a pair of 16-bit chars?

If so, should the last line above actually read:

sb.appendCodePoint(c);

Is there a known test case, that is, a sequence of UTF-8 bytes, that would distinguish between the two options?

Answer 1

As the Javadoc says, the Reader returns whatever it can make of the next piece of input as a single character, that is, one 16-bit char in the range 0 to 65535. The only exception is the end-of-stream indicator, which is -1 as an int. A code point that doesn't fit in 16 bits therefore arrives as two successive chars (a surrogate pair), never as one 32-bit value, so there is no basis for your suggestion.
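
For the test case asked about in the question, here is a minimal sketch; it is an illustration rather than part of the original answer. The class name SurrogateReadDemo, its two helper methods, and the choice of U+1F600 (UTF-8 bytes F0 9F 98 80) as the sample are all assumptions made for this example. It decodes the same bytes twice, once with append((char) c) and once with appendCodePoint(c), and compares the results.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names here are illustrative only.
final class SurrogateReadDemo {

    // U+1F600 encoded as UTF-8: a code point that does not fit in 16 bits.
    static final byte[] SAMPLE = {(byte) 0xF0, (byte) 0x9F, (byte) 0x98, (byte) 0x80};

    // Appends each value returned by read() after casting it to char.
    static String readWithCharCast(byte[] utf8) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8), StandardCharsets.UTF_8)) {
            int c;
            while ((c = reader.read()) != -1) { // -1 signals end of stream
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Same loop, but passes each value returned by read() to appendCodePoint.
    static String readWithAppendCodePoint(byte[] utf8) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8), StandardCharsets.UTF_8)) {
            int c;
            while ((c = reader.read()) != -1) { // never pass -1 to appendCodePoint
                sb.appendCodePoint(c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String a = readWithCharCast(SAMPLE);
        String b = readWithAppendCodePoint(SAMPLE);
        System.out.println(a.length());                      // chars delivered by read()
        System.out.println(a.codePointCount(0, a.length())); // code points they form
        System.out.println(a.equals(b));                     // do the two variants agree?
    }
}
```

On this input both loops should build the same two-char string, because read() hands back the high and low surrogates one at a time and appendCodePoint appends any value below 0x10000 as a single char. The two calls differ only if the end-of-stream value -1 slips through: (char) -1 becomes '\uFFFF', while appendCodePoint(-1) throws IllegalArgumentException.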

Answer 1
1 Accepted 2013-10-01 05:06:51