
Java: How can I get the encoding from an InputStream?

I want to get the encoding from a stream.

1st method: use InputStreamReader.

But it always returns the OS encoding.

    InputStreamReader reader = new InputStreamReader(new FileInputStream("aa.rar"));
    System.out.println(reader.getEncoding());

Output: GBK

2nd method: use UniversalDetector.

But it always returns null.

    FileInputStream input = new FileInputStream("aa.rar");

    UniversalDetector detector = new UniversalDetector(null);
    byte[] buf = new byte[4096];

    int nread;
    while ((nread = input.read(buf)) > 0 && !detector.isDone()) {
        detector.handleData(buf, 0, nread);
    }

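    // Tell the detector that there is no more data
    detector.dataEnd();

    // Read the detected charset name (null if nothing was detected)
    String encoding = detector.getDetectedCharset();

    if (encoding != null) {
        System.out.println("Detected encoding = " + encoding);
    } else {
        System.out.println("No encoding detected.");
    }

    // Reset the detector before reusing it
    detector.reset();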

Output: null

How can I get the right encoding? :(

Let's summarize the situation:

  • InputStream delivers bytes
  • The *Reader classes deliver chars, decoded from bytes using some encoding
  • new InputStreamReader(inputStream) uses the platform default encoding (here the operating system encoding, GBK)
  • new InputStreamReader(inputStream, "UTF-8") uses the given encoding (here UTF-8)
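The points above can be sketched with the standard library alone. This is only an illustration: the class name ReaderEncodingDemo and the helper readAll are made up for this example, and the sample text is an assumption, not the asker's file. It shows that getEncoding() reports what the reader was told to use, and that the same bytes give different chars under different charsets.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
class ReaderEncodingDemo {

    // Decodes every byte of the stream with the given charset and closes the stream.
    static String readAll(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two Chinese characters, encoded as six UTF-8 bytes (sample data for illustration).
        byte[] utf8Bytes = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_8);

        // No charset given: the reader reports the platform default, whatever the data is.
        InputStreamReader defaultReader =
                new InputStreamReader(new ByteArrayInputStream(utf8Bytes));
        System.out.println("Default reader encoding: " + defaultReader.getEncoding());

        // Correct charset: the original two characters come back.
        String right = readAll(new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8);
        System.out.println("UTF-8 decode length: " + right.length());

        // Wrong charset: each byte becomes its own char, so the text is garbled.
        String wrong = readAll(new ByteArrayInputStream(utf8Bytes), StandardCharsets.ISO_8859_1);
        System.out.println("ISO-8859-1 decode length: " + wrong.length());
    }
}
```

The reader never inspects the data to choose a charset, which is why the first method in the question always printed the OS encoding.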

So one needs to know the encoding before reading. You did everything right by first using a charset-detecting class.
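If you already have a guess, one way to check it before reading is a strict trial decode. The sketch below uses only java.nio.charset; the class StrictDecodeCheck and its methods are hypothetical names for this example. It is a different and much cruder technique than what UniversalDetector does, and a clean decode does not prove the guess is right (for instance, any byte sequence decodes cleanly as ISO-8859-1).

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
class StrictDecodeCheck {

    // Returns true if the bytes decode in the given charset without any error.
    static boolean decodesCleanly(byte[] data, Charset charset) {
        // REPORT makes the decoder throw instead of silently substituting characters.
        CharsetDecoder decoder = charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Returns the first candidate that decodes the bytes cleanly, or null if none does.
    static Charset firstMatching(byte[] data, Charset... candidates) {
        for (Charset candidate : candidates) {
            if (decodesCleanly(data, candidate)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] utf8Bytes = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_8);
        // The same two characters as GBK bytes, written out literally (sample data).
        byte[] gbkBytes = {(byte) 0xD6, (byte) 0xD0, (byte) 0xCE, (byte) 0xC4};

        System.out.println(decodesCleanly(utf8Bytes, StandardCharsets.UTF_8)); // true
        System.out.println(decodesCleanly(gbkBytes, StandardCharsets.UTF_8));  // false
    }
}
```

Order the candidates from strictest to most permissive, since a permissive charset listed first will match everything.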

According to http://code.google.com/p/juniversalchardet/, it should handle UTF-8 and UTF-16. You might use the editor jEdit to verify the file's encoding and see whether there is some problem.
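For UTF-8 and UTF-16 files, a quick manual check is to look for a byte order mark (BOM) in the first bytes. The following is a minimal sketch under that assumption; BomSniffer and charsetFromBom are hypothetical names, and it is not a description of how juniversalchardet works internally. Many files have no BOM at all, in which case this returns null and says nothing about the encoding.

```java
// Hypothetical helper class, not part of any library.
class BomSniffer {

    // Returns the charset name implied by a leading BOM, or null if there is none.
    // Only UTF-8 and UTF-16 marks are checked; UTF-32 is deliberately ignored here.
    static String charsetFromBom(byte[] head) {
        if (head.length >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            return "UTF-16BE";
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            return "UTF-16LE";
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 0x61};
        byte[] plainAscii = {0x61, 0x62, 0x63};

        System.out.println(charsetFromBom(withBom));    // UTF-8
        System.out.println(charsetFromBom(plainAscii)); // null
    }
}
```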

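    import java.io.InputStream;
    import org.mozilla.universalchardet.UniversalDetector;

    // Returns the detected charset name, or null if nothing was detected
    // or the stream could not be read.
    public String getDecoder(InputStream inputStream) {

        String encoding = null;

        try {
            byte[] buf = new byte[4096];
            UniversalDetector detector = new UniversalDetector(null);
            int nread;

            // Feed the detector until it is done or the stream ends
            while ((nread = inputStream.read(buf)) > 0 && !detector.isDone()) {
                detector.handleData(buf, 0, nread);
            }

            detector.dataEnd();
            encoding = detector.getDetectedCharset();
            detector.reset();

            inputStream.close();

        } catch (Exception e) {
            // Deliberately ignored: on any failure the method returns null
        }

        return encoding;
    }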
