
Reading files with special characters from a zipfile

My code reads files from a zipfile, which works fine except for files with special characters in their names. The problematic character is 'è' (see fère_champenoise in my code).

    String works = "(3/(3)_juno.jpa";
    String sdoesntwork = "ba/battle of fère_champenoise.jpa";

    ZipFile file1 = null;
    try {
        file1 = new ZipFile(sZipFileOld);
    } catch (IOException e) {
        System.out.println("Could not open zip file " + sZipFileOld + ": " + e);
    }

    try {
        file1.getInputStream(file1.getEntry(sdoesntwork));
    } catch (IOException e) {
        System.out.println(sdoesntwork + ": IO Error " + e);
        e.printStackTrace();
    }

It throws an error, but the error doesn't go through the exception handler:

Exception in thread "main" java.lang.NullPointerException
    at java.util.zip.ZipFile.getInputStream(Unknown Source)
    at ZipCompare.main(ZipCompare.java:56)

Any solutions?

When constructing the ZipFile, explicitly specify the encoding: file1 = new ZipFile(sZipFileOld, Charset.forName("IBM437"));

Zip files don't necessarily use UTF-8 for special characters in entry names, even though UTF-8 is the default that ZipFile assumes.
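As a minimal sketch of this suggestion, the helper below opens the archive with an explicit charset for entry names and returns null when the entry is not found instead of failing. The class name ZipEntryReader and the method readEntry are hypothetical names chosen for illustration; the sketch assumes Java 9 or later (for InputStream.readAllBytes) and a JDK that ships the IBM437 charset.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of the original question's code.
final class ZipEntryReader {

    /**
     * Reads one entry from a zip archive, decoding entry names with the
     * given charset. Returns null if no entry has that name.
     */
    static byte[] readEntry(Path zipPath, String entryName, Charset nameCharset)
            throws IOException {
        // The charset only affects how entry names and comments are decoded.
        try (ZipFile zip = new ZipFile(zipPath.toFile(), nameCharset)) {
            ZipEntry entry = zip.getEntry(entryName);
            if (entry == null) {
                return null; // name not present under this charset
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return in.readAllBytes(); // Java 9+
            }
        }
    }
}
```

A call might look like ZipEntryReader.readEntry(Paths.get(sZipFileOld), sdoesntwork, Charset.forName("IBM437")), assuming the archive really was written with CP437 names.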

I believe you need to specify an encoding, probably UTF-8. Something like this:

 final Reader in = new InputStreamReader(file1.getInputStream(file1.getEntry(sdoesntwork)), "utf-8");

Make sure you remember to close this in a finally block.
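A sketch of the same idea using try-with-resources, which closes the reader automatically and so replaces the explicit finally block. Note that the charset passed to InputStreamReader decodes the file's contents, not the entry names. ZipTextReader and readText are hypothetical names, and the sketch assumes the caller already holds a non-null ZipEntry.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of the original answer.
final class ZipTextReader {

    /** Reads the whole entry as text, decoding its bytes with contentCharset. */
    static String readText(ZipFile zip, ZipEntry entry, Charset contentCharset)
            throws IOException {
        StringBuilder text = new StringBuilder();
        // try-with-resources closes the reader (and the underlying stream)
        // even if reading throws.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(zip.getInputStream(entry), contentCharset))) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        }
        return text.toString();
    }
}
```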

The problem is that file1.getEntry(sdoesntwork) returns null because it does not find that entry. If you are sure this name is correct, then try to use:

file1 = new ZipFile(sZipFileOld,StandardCharsets.UTF_8);

It doesn't go through your exception handler because it is a different type of exception: a NullPointerException is thrown because the entry is not found, and your handler only catches IOException. You should check with which charset the zip file was created.
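One way to make a missing entry reach the existing IOException handler is to check the result of getEntry before opening the stream. The sketch below is an illustration under that assumption; SafeZipAccess and openEntry are hypothetical names, and FileNotFoundException is just one reasonable choice of IOException subclass.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of the original answer.
final class SafeZipAccess {

    /**
     * Opens the named entry, or throws FileNotFoundException (an IOException)
     * when getEntry returns null, instead of letting getInputStream(null)
     * fail with a NullPointerException.
     */
    static InputStream openEntry(ZipFile zip, String entryName) throws IOException {
        ZipEntry entry = zip.getEntry(entryName);
        if (entry == null) {
            throw new FileNotFoundException(
                    "No entry named '" + entryName + "' in " + zip.getName());
        }
        return zip.getInputStream(entry);
    }
}
```

With this in place, replacing file1.getInputStream(file1.getEntry(sdoesntwork)) by SafeZipAccess.openEntry(file1, sdoesntwork) would let the question's catch (IOException e) block report the missing entry.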

file1 = new ZipFile(sZipFileOld, StandardCharsets.UTF_8);

charset - The charset to be used to decode the ZIP entry name and comment (ignored if the language encoding bit of the ZIP entry's general purpose bit flag is set).

If the zip entry names and comments are all ASCII, it is not necessary to construct the ZipFile this way.
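When it is unclear which charset the archive's names were written in, one option is to try a few candidates and see which one makes the expected entry visible. The sketch below is only a diagnostic aid under stated assumptions: ZipNameCharsetProbe and findCharsetFor are hypothetical names, and the catch clause assumes that some JDK versions reject an archive whose names are malformed in the chosen charset rather than returning null from getEntry.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Path;
import java.util.Optional;
import java.util.zip.ZipFile;

// Hypothetical diagnostic helper, not part of the original answer.
final class ZipNameCharsetProbe {

    /**
     * Returns the first candidate charset under which entryName can be
     * looked up in the archive, or Optional.empty() if none matches
     * (or the archive cannot be opened at all).
     */
    static Optional<Charset> findCharsetFor(Path zipPath, String entryName,
                                            Charset... candidates) {
        for (Charset candidate : candidates) {
            try (ZipFile zip = new ZipFile(zipPath.toFile(), candidate)) {
                if (zip.getEntry(entryName) != null) {
                    return Optional.of(candidate);
                }
            } catch (IOException | IllegalArgumentException e) {
                // Assumption: the archive may be rejected when its names are
                // malformed in this charset; move on to the next candidate.
            }
        }
        return Optional.empty();
    }
}
```

For example, findCharsetFor(Paths.get(sZipFileOld), sdoesntwork, StandardCharsets.UTF_8, Charset.forName("IBM437")) tries UTF-8 first and falls back to CP437. Because the charset is ignored for entries that set the language encoding bit, a match does not prove which charset the archive uses, only that the lookup works with it.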
