
Can't read Chinese words when unzipping multiple zip files

My program works for test1.txt through test3.txt. However, an error occurs when it reads 你好.txt. How can I modify my program to fix this problem?

Here is the folder structure. The note folder contains two zip files, zipFile1.zip and zipFile2.zip. I would appreciate it if anyone could answer my question.

Operating System: Windows 10

Error message: Exception in thread "main" java.util.zip.ZipException: invalid CEN header (bad entry name)

Version of Java: openjdk version "1.8.0_191-1-ojdkbuild"

├── note
    ├── zipFile1.zip
            ├── test1.txt
            ├── test2.txt

    ├── zipFile2.zip
            ├── test3.txt
            ├── 你好.txt

Here is my program.

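import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class test {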

    
    private static final String SOURCE_FOLDER = "note_folder_path";
    
    static File folder = new File(SOURCE_FOLDER);
    static File[] files = folder.listFiles();
    
    final static Charset CHINESE_CHARSET = Charset.forName("MS950");

    public static void main(String[] args) throws IOException
    {
        
        for (File file:files) {
            extractFolder(file.getAbsolutePath());
        }
        
    }
    
    public static void extractFolder(String zipFile) throws IOException {
        int buffer = 2048;
        File file = new File(zipFile);

        try (ZipFile zip = new ZipFile(file,CHINESE_CHARSET))
        
        {
          String newPath = zipFile.substring(0, zipFile.length() - 4);

          new File(newPath).mkdir();
          Enumeration<? extends ZipEntry> zipFileEntries = zip.entries();

          // Process each entry
          while (zipFileEntries.hasMoreElements()) {
            // grab a zip file entry
            ZipEntry entry = zipFileEntries.nextElement();
            String currentEntry = entry.getName();
            File destFile = new File(newPath, currentEntry);
            File destinationParent = destFile.getParentFile();

            // create the parent directory structure if needed
            destinationParent.mkdirs();

            if (!entry.isDirectory()) {
              // establish buffer for writing file
              byte[] data = new byte[buffer];

              // write the current file to disk; both streams are closed automatically
              try (BufferedInputStream is = new BufferedInputStream(zip.getInputStream(entry));
                   BufferedOutputStream dest = new BufferedOutputStream(new FileOutputStream(destFile), buffer)) {

                // read and write until last byte is encountered
                int currentByte;
                while ((currentByte = is.read(data, 0, buffer)) != -1) {
                  dest.write(data, 0, currentByte);
                }
                dest.flush();
              }
            }

            if (currentEntry.endsWith(".zip")) {
              // found a zip file, try to open
              extractFolder(destFile.getAbsolutePath());
            }
          }
        }
    }
}

This is a filename encoding issue and has nothing to do with the contents of 你好.txt.

Looking at the source code for java.util.zip, the error message invalid CEN header (bad entry name) is output in two cases:

  1. if the language encoding bit is set in the zip file and the filename is not valid UTF-8
  2. when UTF-8 is not being used and the encoding for the filename does not match the encoding name stored in the encoding environment variable.

See "Setting the default Java character encoding" for details on using the Java encoding environment variable.
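One way to narrow down the second case is to try opening the archive with a few candidate charsets and see which ones decode every entry name. The sketch below is only an illustration: the class `ZipCharsetProbe`, its method names, and the candidate list (UTF-8, GBK, MS950) are assumptions, not part of the question's program. MS950 is the Traditional Chinese (Big5) code page, so an archive created on a Simplified Chinese system may well use GBK instead. A charset that decodes without error is not necessarily the right one, so the resulting names should still be checked by eye.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper for illustration; not part of the question's program.
class ZipCharsetProbe {

    // Returns the first candidate charset under which the archive opens and
    // every entry name decodes, or null if no candidate works.
    static Charset firstWorkingCharset(File zip, List<Charset> candidates) {
        for (Charset cs : candidates) {
            try (ZipFile zf = new ZipFile(zip, cs)) {
                Enumeration<? extends ZipEntry> entries = zf.entries();
                while (entries.hasMoreElements()) {
                    entries.nextElement().getName(); // forces the name to be decoded
                }
                return cs;
            } catch (IOException | IllegalArgumentException e) {
                // Depending on the JDK version, a bad name surfaces as a
                // ZipException (an IOException) or an IllegalArgumentException.
                // Any other I/O failure also just moves on to the next candidate.
            }
        }
        return null;
    }

    // Assumed candidate order: strict UTF-8 first, then the Chinese code pages
    // if this JVM supports them.
    static List<Charset> defaultCandidates() {
        List<Charset> list = new ArrayList<>();
        list.add(StandardCharsets.UTF_8);
        for (String name : new String[] {"GBK", "MS950"}) {
            if (Charset.isSupported(name)) {
                list.add(Charset.forName(name));
            }
        }
        return list;
    }
}
```

For example, `ZipCharsetProbe.firstWorkingCharset(new File("zipFile2.zip"), ZipCharsetProbe.defaultCandidates())` returns the charset to pass to `new ZipFile(file, charset)`, or null if none of the candidates fits.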

To know for sure what the issue is, can you share the zip file?

If not, can you post a readout of the internal structure of the file by running zipdetails against the zip file? Usage is

zipdetails -v  zipFile2.zip

This program is present in most recent Linux distributions. As you are running Windows, you can get access to this script through WSL if you don't have access to a Linux distribution.
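If running zipdetails is not convenient, the language encoding bit mentioned above can also be read directly in Java. The following is a rough sketch under stated assumptions: the class `ZipFlagDump` and its methods are hypothetical, the whole archive is loaded into memory, and Zip64 and multi-disk archives are not handled. It walks the central directory and reports, for each entry, the raw filename bytes in hex and whether bit 11 (0x0800) of the general purpose flags is set.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; plain (non-Zip64) archives only.
class ZipFlagDump {

    // Reads an unsigned little-endian 16-bit value.
    static int u16(byte[] b, int off) {
        return (b[off] & 0xFF) | ((b[off + 1] & 0xFF) << 8);
    }

    // Reads an unsigned little-endian 32-bit value.
    static long u32(byte[] b, int off) {
        return u16(b, off) | ((long) u16(b, off + 2) << 16);
    }

    // Maps each entry's raw name bytes (as hex) to whether the UTF-8 bit is set,
    // in central directory order.
    static Map<String, Boolean> utf8Flags(byte[] zip) {
        // Locate the end-of-central-directory record by scanning backwards.
        int eocd = -1;
        for (int i = zip.length - 22; i >= 0; i--) {
            if (u32(zip, i) == 0x06054b50L) {
                eocd = i;
                break;
            }
        }
        if (eocd < 0) {
            throw new IllegalArgumentException("no end-of-central-directory record found");
        }
        int count = u16(zip, eocd + 10);     // total number of entries
        int pos = (int) u32(zip, eocd + 16); // offset of the central directory

        Map<String, Boolean> result = new LinkedHashMap<>();
        for (int n = 0; n < count; n++) {
            if (u32(zip, pos) != 0x02014b50L) {
                throw new IllegalArgumentException("bad central directory header at " + pos);
            }
            int flags = u16(zip, pos + 8);
            int nameLen = u16(zip, pos + 28);
            int extraLen = u16(zip, pos + 30);
            int commentLen = u16(zip, pos + 32);
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < nameLen; i++) {
                hex.append(String.format("%02x", zip[pos + 46 + i] & 0xFF));
            }
            result.put(hex.toString(), (flags & 0x0800) != 0);
            pos += 46 + nameLen + extraLen + commentLen;
        }
        return result;
    }
}
```

Calling `ZipFlagDump.utf8Flags(java.nio.file.Files.readAllBytes(java.nio.file.Paths.get("zipFile2.zip")))` gives one map entry per file. If the bit is not set for 你好.txt, the hex bytes of its name show which legacy encoding the archive was written with.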
