Delete files from a ZIP archive without decompressing, using Java (preferred) or Python

Hi,

I work with large ZIP files containing many hundreds of highly compressed text files. Decompressing one of these ZIP files can take a while and easily consumes up to 20 GB of disk space. I would like to remove certain files from these ZIP files without having to decompress everything and recompress only the files I want.

Of course it is possible to do this the long way, but that is very inefficient.

I would prefer to do this in Java, but will consider Python.

I've found this on the web. It is a clean solution that uses only the standard library, but I'm not sure whether it is included in the Android SDK; that remains to be checked.

import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

public class ZPFSDelete {
    public static void main(String[] args) throws Exception {

        /* Define the ZIP file system properties in a HashMap */
        Map<String, String> zip_properties = new HashMap<>();
        /* We want to open an existing ZIP file, so we set this to false */
        zip_properties.put("create", "false");

        /* Specify the path to the ZIP file that you want to open as a file system */
        URI zip_disk = URI.create("jar:file:/my_zip_file.zip");

        /* Create the ZIP file system */
        try (FileSystem zipfs = FileSystems.newFileSystem(zip_disk, zip_properties)) {
            /* Get the path, inside the ZIP file, of the entry to delete */
            Path pathInZipfile = zipfs.getPath("source.sql");
            System.out.println("About to delete an entry from ZIP file " + pathInZipfile.toUri());
            /* Execute the delete */
            Files.delete(pathInZipfile);
            System.out.println("File successfully deleted");
        }
    }
}
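
The snippet above hard-codes the archive URI and removes a single entry. As a minimal sketch of the same ZIP file system technique, the variant below builds the `jar:` URI from a `Path` and removes several entries in one pass. The class name `ZipFsDeleteMany` and the method `deleteEntries` are made up for illustration and are not part of the original answer.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class; only standard java.nio.file calls are used.
public class ZipFsDeleteMany {
    /** Deletes the named entries from the archive and returns how many were actually present. */
    public static int deleteEntries(Path zipPath, List<String> entryNames) throws IOException {
        // "jar:" + file URI is the form the ZIP file system provider expects.
        URI uri = URI.create("jar:" + zipPath.toAbsolutePath().toUri());
        int deleted = 0;
        try (FileSystem zipfs = FileSystems.newFileSystem(uri, Collections.<String, String>emptyMap())) {
            for (String name : entryNames) {
                // deleteIfExists returns false instead of throwing when the entry is missing.
                if (Files.deleteIfExists(zipfs.getPath(name))) {
                    deleted++;
                }
            }
        } // closing the file system writes the changes back to the archive
        return deleted;
    }
}
```

Names that are not in the archive are skipped rather than treated as errors, which seems convenient when the list of unwanted files comes from a pattern or an external list.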

I don't have code to do this, but the basic idea is simple and should translate into almost any language the same way. The ZIP file layout is just a series of blocks that represent files (a header followed by the compressed data), finished off with a central directory that contains all the metadata. Here's the process:

  1. Scan forward in the file until you find the first file you want to delete.
  2. Scan forward in the file until you find the first file you don't want to delete or you hit the central directory.
  3. Scan forward in the file until you find the first file you want to delete or you hit the central directory.
  4. Copy all the data you found in step 3 back onto the data you skipped in step 2 until you find another file you want to delete or you hit the central directory.
  5. Go to step 2 unless you've hit the central directory.
  6. Copy the central directory to wherever you left off copying, leaving out the entries for the deleted files and changing the offsets to reflect how far you moved each file.

See http://en.wikipedia.org/wiki/ZIP_%28file_format%29 for all the details on the ZIP file structures.

As bestsss suggests, you might want to perform the copying into another file, so as to prevent losing data in the event of a failure.
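
The steps above, in the copy-to-another-file variant, could be sketched roughly as follows. This is an illustration under stated assumptions rather than a complete ZIP implementation: the class `RawZipDelete` and its method `deleteEntries` are hypothetical names, and the sketch assumes no ZIP64 records, a single-disk archive, UTF-8 or ASCII entry names, and nothing stored between entries (any bytes between two local headers are copied along with the earlier entry). Instead of scanning forward, it reads the central directory first and uses the recorded offsets to find each block.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not from the answer. Assumes no ZIP64, a single-disk
// archive, UTF-8/ASCII entry names, and nothing stored between entries.
public class RawZipDelete {
    private static final class Entry {
        // cenPos/cenLen: record position and length in the central directory; offset: local header offset
        final int cenPos, cenLen; final long offset; final String name;
        Entry(int cenPos, int cenLen, long offset, String name) {
            this.cenPos = cenPos; this.cenLen = cenLen; this.offset = offset; this.name = name;
        }
    }

    /** Writes a copy of src to dst without the named entries; compressed data is copied byte for byte. */
    public static void deleteEntries(Path src, Path dst, Set<String> names) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.CREATE,
                     StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE)) {
            // Find the end-of-central-directory record (22 bytes plus an optional comment).
            int tailLen = (int) Math.min(in.size(), 22 + 0xFFFF);
            ByteBuffer tail = read(in, in.size() - tailLen, tailLen);
            int eocd = tailLen - 22;
            while (eocd >= 0 && tail.getInt(eocd) != 0x06054b50) eocd--;
            if (eocd < 0) throw new IOException("end of central directory not found");
            int count = tail.getShort(eocd + 10) & 0xFFFF;
            long cenSize = tail.getInt(eocd + 12) & 0xFFFFFFFFL;
            long cenOffset = tail.getInt(eocd + 16) & 0xFFFFFFFFL;
            if (count == 0xFFFF || cenOffset == 0xFFFFFFFFL) throw new IOException("ZIP64 not supported");
            // Parse the central directory into records.
            ByteBuffer cen = read(in, cenOffset, (int) cenSize);
            List<Entry> entries = new ArrayList<>();
            for (int i = 0, p = 0; i < count; i++) {
                if (cen.getInt(p) != 0x02014b50) throw new IOException("bad central directory header");
                int nameLen = cen.getShort(p + 28) & 0xFFFF;
                int len = 46 + nameLen + (cen.getShort(p + 30) & 0xFFFF) + (cen.getShort(p + 32) & 0xFFFF);
                String name = new String(cen.array(), p + 46, nameLen, StandardCharsets.UTF_8);
                entries.add(new Entry(p, len, cen.getInt(p + 42) & 0xFFFFFFFFL, name));
                p += len;
            }
            // Copy each kept block (local header + data) and patch its new offset into the directory.
            List<Entry> byOffset = new ArrayList<>(entries);
            byOffset.sort(Comparator.comparingLong(e -> e.offset));
            for (int i = 0; i < byOffset.size(); i++) {
                Entry e = byOffset.get(i);
                if (names.contains(e.name)) continue;
                long end = i + 1 < byOffset.size() ? byOffset.get(i + 1).offset : cenOffset;
                cen.putInt(e.cenPos + 42, (int) out.position());
                copy(in, e.offset, end, out);
            }
            // Write the surviving directory records, then the patched end record.
            long newCenOffset = out.position();
            int kept = 0;
            for (Entry e : entries) {
                if (names.contains(e.name)) continue;
                writeAll(out, ByteBuffer.wrap(cen.array(), e.cenPos, e.cenLen));
                kept++;
            }
            tail.putShort(eocd + 8, (short) kept).putShort(eocd + 10, (short) kept);
            tail.putInt(eocd + 12, (int) (out.position() - newCenOffset)).putInt(eocd + 16, (int) newCenOffset);
            tail.position(eocd);
            writeAll(out, tail);
        }
    }

    /** Reads exactly len bytes at pos into a little-endian heap buffer. */
    private static ByteBuffer read(FileChannel ch, long pos, int len) throws IOException {
        ByteBuffer b = ByteBuffer.allocate(len).order(ByteOrder.LITTLE_ENDIAN);
        while (b.hasRemaining()) {
            if (ch.read(b, pos + b.position()) < 0) throw new IOException("unexpected end of file");
        }
        b.flip();
        return b;
    }

    /** Copies the byte range [from, to) of in to the current position of out in 64 KiB chunks. */
    private static void copy(FileChannel in, long from, long to, FileChannel out) throws IOException {
        while (from < to) {
            ByteBuffer chunk = read(in, from, (int) Math.min(1 << 16, to - from));
            from += chunk.remaining();
            writeAll(out, chunk);
        }
    }

    private static void writeAll(FileChannel out, ByteBuffer b) throws IOException {
        while (b.hasRemaining()) out.write(b);
    }
}
```

A call such as `RawZipDelete.deleteEntries(Paths.get("my_zip_file.zip"), Paths.get("trimmed.zip"), Collections.singleton("source.sql"))` (file names are placeholders) leaves the original archive untouched, so it can be replaced with the new file only after the copy has succeeded.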

Yes, it is possible in Java using a library called TrueZIP.

TrueZIP is a Java-based virtual file system (VFS) which enables client applications to perform CRUD (Create, Read, Update, Delete) operations on archive files as if they were virtual directories, even with nested archive files in multithreaded environments.

See the link below for more information: https://truezip.java.net/

OK, I think I found a potential solution from www.javaer.org. It definitely deletes files inside the zip, and I don't think it extracts anything to disk. Here is the code:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public static void deleteZipEntry(File zipFile, String[] files) throws IOException {
    // get a temp file
    File tempFile = File.createTempFile(zipFile.getName(), null);
    // delete it, otherwise you cannot rename your existing zip to it.
    tempFile.delete();
    tempFile.deleteOnExit();
    boolean renameOk = zipFile.renameTo(tempFile);
    if (!renameOk) {
        throw new RuntimeException("could not rename the file " + zipFile.getAbsolutePath()
                + " to " + tempFile.getAbsolutePath());
    }
    byte[] buf = new byte[1024];

    // try-with-resources closes both streams (and completes the ZIP file) even on failure
    try (ZipInputStream zin = new ZipInputStream(new FileInputStream(tempFile));
         ZipOutputStream zout = new ZipOutputStream(new FileOutputStream(zipFile))) {
        ZipEntry entry = zin.getNextEntry();
        while (entry != null) {
            String name = entry.getName();
            boolean toBeDeleted = false;
            for (String f : files) {
                if (f.equals(name)) {
                    toBeDeleted = true;
                    break;
                }
            }
            if (!toBeDeleted) {
                // Add ZIP entry to output stream.
                zout.putNextEntry(new ZipEntry(name));
                // Transfer bytes from the old ZIP file to the new one. Note that
                // zin.read() yields inflated bytes, which zout compresses again.
                int len;
                while ((len = zin.read(buf)) > 0) {
                    zout.write(buf, 0, len);
                }
            }
            entry = zin.getNextEntry();
        }
    }
    tempFile.delete();
}

This might be old, but here is one way. It does work: I use it constantly and it works fine.

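// Note: java.io.File treats zip_dir as an ordinary directory on disk;
// this code does not open a ZIP archive.
public boolean deleteFile(String zip_dir, String subfile){

    File target = new File(zip_dir, subfile);
    delete(target);
    // report success if nothing is left at the target path
    return !target.exists();

}

private void delete(File file)
{
    if(file == null || !file.exists())
        return;
    if(file.isFile())
    {
        file.delete();
        return;
    }
    // listFiles() returns null if the directory cannot be read
    File children[] = file.listFiles();
    if(children != null)
    {
        for(File child : children)
            delete(child);
    }

    file.delete();
}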
