Why does FileInputStream readLine return null when I set the encoding to UTF-16?

Question

It works fine with UTF-8, and it also works fine with UTF-16 if I use a different file.

BufferedReader br = new BufferedReader(
        new InputStreamReader(new FileInputStream(filePath), "UTF-16"));

If I replace UTF-16 with UTF-8 in the above code, everything works as expected. Why is that?

The suggested answer is different because I just need to read the file. The answer was simple: I can't read the file as UTF-16 if it is actually encoded as UTF-8.

Answer 1

Check the encoding of your files. UTF-16 can be encoded using big-endian (UTF-16BE) or little-endian (UTF-16LE) byte order. These produce different byte sequences for the same text.
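To make the difference concrete, here is a minimal sketch that encodes the same four characters with each charset and prints the resulting bytes as hex. The class name Utf16ByteOrderDemo and its helper methods are made up for this illustration. Note that Java's "UTF-16" charset writes a big-endian byte-order mark (FE FF) when encoding, while its decoder accepts either byte order when a mark is present.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Utf16ByteOrderDemo {

    // Formats a byte array as space-separated upper-case hex pairs.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encodes the text with the given charset and returns the bytes as hex.
    static String encodeToHex(String text, Charset charset) {
        return toHex(text.getBytes(charset));
    }

    public static void main(String[] args) {
        // "üäöü" written with escapes so the source file encoding does not matter.
        String text = "\u00FC\u00E4\u00F6\u00FC";
        System.out.println("UTF-8:    " + encodeToHex(text, StandardCharsets.UTF_8));
        System.out.println("UTF-16:   " + encodeToHex(text, StandardCharsets.UTF_16));
        System.out.println("UTF-16LE: " + encodeToHex(text, StandardCharsets.UTF_16LE));
        System.out.println("UTF-16BE: " + encodeToHex(text, StandardCharsets.UTF_16BE));
    }
}
```

The UTF-16LE and UTF-16BE lines show the same byte pairs in swapped order, and the UTF-16 line starts with the extra FE FF mark.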

This code works for four variants of the same file.

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

public class SOPlayground {

    public static void main(String[] args) throws Exception {
        // Each file holds the same text, saved in a different encoding.
        readAndPrint("/tmp/u-8.txt", Charset.forName("UTF-8"));
        readAndPrint("/tmp/u-16.txt", Charset.forName("UTF-16"));
        readAndPrint("/tmp/u-16le.txt", Charset.forName("UTF-16LE"));
        readAndPrint("/tmp/u-16be.txt", Charset.forName("UTF-16BE"));
    }

    // Reads the file line by line with the given charset and prints each line.
    private static void readAndPrint(String filePath, final Charset charset) throws IOException {
        // try-with-resources closes the reader and the underlying stream when done.
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new FileInputStream(filePath), charset))) {
            String line = br.readLine();
            while (line != null) {
                System.out.println(line);
                line = br.readLine();
            }
        }
    }
}

On GNU/Linux you can check the encoding using the file tool:

/tmp % file u*.txt
u-16be.txt: data
u-16le.txt: data
u-16.txt:   Little-endian UTF-16 Unicode text, with no line terminators
u-8.txt:    UTF-8 Unicode text
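If the file tool is not available, a rough equivalent can be sketched in Java by looking for a byte-order mark at the start of the file. The class BomSniffer and its methods are hypothetical names for this illustration, not part of any library. Like file in the output above, this approach cannot identify UTF-16LE or UTF-16BE files that carry no mark, and it does not distinguish a UTF-32LE mark, which also begins with FF FE.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class BomSniffer {

    // Returns a charset name guessed from a leading byte-order mark,
    // or null if the bytes do not start with a recognized mark.
    static String detectBom(byte[] head) {
        if (head.length >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            return "UTF-16BE";
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            return "UTF-16LE";
        }
        return null;
    }

    // Reads up to the first three bytes of the file and checks them for a mark.
    static String detectBomInFile(String filePath) throws IOException {
        try (InputStream in = new FileInputStream(filePath)) {
            byte[] head = new byte[3];
            int n = 0;
            while (n < head.length) {
                int r = in.read(head, n, head.length - n);
                if (r < 0) {
                    break; // file is shorter than three bytes
                }
                n += r;
            }
            return detectBom(Arrays.copyOf(head, n));
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass one or more file paths on the command line.
        for (String path : args) {
            String guess = detectBomInFile(path);
            System.out.println(path + ": " + (guess == null ? "no byte-order mark found" : guess));
        }
    }
}
```

A null result only means that no mark was found. The file may still be UTF-8 or UTF-16 without a mark, so the encoding has to be known from another source in that case.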

The contents of these files are all different:

/tmp % cat u*.txt
����
����
������
üäöü

But using the above Java code, they can be read correctly. The output of my Java code is:

üäöü
üäöü
üäöü
üäöü
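For contrast, the following sketch shows what happens when the charset does not match the data. It works on an in-memory byte array rather than a file, and the class CharsetMismatchDemo and its firstLine helper are invented for this example. It does not claim to reproduce the exact situation from the question. It only illustrates that UTF-8 bytes decoded as UTF-16 do not give back the original text, and that readLine returns null once the end of the stream is reached without any characters being read.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    // Decodes the bytes with the given charset and returns the first line,
    // or null if the stream contains no characters at all.
    static String firstLine(byte[] data, Charset charset) throws IOException {
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), charset))) {
            return br.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // "üäöü" written with escapes so the source file encoding does not matter.
        String text = "\u00FC\u00E4\u00F6\u00FC";
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);

        // Matching charset: the original text comes back.
        System.out.println(text.equals(firstLine(utf8Bytes, StandardCharsets.UTF_8)));  // true

        // Mismatched charset: byte pairs are read as UTF-16 code units, giving other characters.
        System.out.println(text.equals(firstLine(utf8Bytes, StandardCharsets.UTF_16))); // false

        // Empty input: readLine signals end of stream with null.
        System.out.println(firstLine(new byte[0], StandardCharsets.UTF_16));            // null
    }
}
```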
