
Delete files from a ZIP archive without Decompressing in Java or maybe Python


Hi,

I work with large ZIP files containing many hundreds of highly compressed text files. Decompressing one of these ZIP files can take a while and easily consumes up to 20 GB of disk space. I would like to remove certain files from these ZIP files without having to decompress everything and then recompress only the files I want to keep.

It is certainly possible to do this the long way, but that is very inefficient.

I would prefer to do this in Java, but will consider Python.

I found this on the web.

It is a clean solution that uses only the standard library, though I'm not sure whether it is included in the Android SDK; that remains to be checked.

import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

public class ZPFSDelete {
    public static void main(String[] args) throws Exception {

        /* Define ZIP file system properties in a HashMap */
        Map<String, String> zip_properties = new HashMap<>();
        /* We want to open an existing ZIP file, so we set "create" to false */
        zip_properties.put("create", "false");

        /* Specify the path to the ZIP file that you want to open as a file system */
        URI zip_disk = URI.create("jar:file:/my_zip_file.zip");

        /* Create the ZIP file system; closing it writes the change back to the archive */
        try (FileSystem zipfs = FileSystems.newFileSystem(zip_disk, zip_properties)) {
            /* Get the path inside the ZIP file of the entry to delete */
            Path pathInZipfile = zipfs.getPath("source.sql");
            System.out.println("About to delete an entry from ZIP file: " + pathInZipfile.toUri());
            /* Execute the delete; throws NoSuchFileException if the entry does not exist */
            Files.delete(pathInZipfile);
            System.out.println("File successfully deleted");
        }
    }
}
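
The same approach can be wrapped in a reusable method that takes the archive path and the entry names as parameters instead of hard-coding them. This is only a minimal sketch: the class name `ZipEntryRemover` and the method `removeEntries` are made up for illustration, and it assumes a JDK that ships the ZIP file system provider (Java 7 or later). As far as I know, the provider rewrites the archive when the file system is closed, so expect temporary disk usage of roughly the size of the remaining archive.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collection;
import java.util.Collections;

/** Hypothetical helper, not part of any library. */
public class ZipEntryRemover {

    /** Deletes the named entries from the archive and returns how many were actually removed. */
    public static int removeEntries(Path zipFile, Collection<String> entryNames) throws IOException {
        // Build a "jar:file:..." URI from the real path so spaces and other characters are escaped.
        URI uri = URI.create("jar:" + zipFile.toAbsolutePath().toUri());
        int removed = 0;
        // An empty property map opens an existing archive (no "create" = "true").
        try (FileSystem zipfs = FileSystems.newFileSystem(uri, Collections.<String, String>emptyMap())) {
            for (String name : entryNames) {
                // deleteIfExists returns false instead of throwing when the entry is missing.
                if (Files.deleteIfExists(zipfs.getPath(name))) {
                    removed++;
                }
            }
        } // closing the file system writes the modified archive back to disk
        return removed;
    }
}
```

A call would look like `ZipEntryRemover.removeEntries(Paths.get("my_zip_file.zip"), Arrays.asList("source.sql"))`.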

I don't have code to do this, but the basic idea is simple and should translate into almost any language the same way. The ZIP file layout is just a series of blocks that represent files (a header followed by the compressed data), finished off with a central directory that just contains all the metadata. Here's the process:

  1. Scan forward in the file until you find the first file you want to delete.
  2. Scan forward in the file until you find the first file you don't want to delete or you hit the central directory.
  3. Scan forward in the file until you find the first file you want to delete or you hit the central directory.
  4. Copy all the data you found in step 3 back onto the data you skipped in step 2 until you find another file you want to delete or you hit the central directory.
  5. Go to step 2 unless you've hit the central directory.
  6. Copy the central directory to wherever you left off copying, leaving out the entries for the deleted files and changing the offsets to reflect how far you moved each file.

See http://en.wikipedia.org/wiki/ZIP_%28file_format%29 for all the details on the ZIP file structures.

As bestsss suggests, you might want to perform the copying into another file, so as to prevent losing data in the event of a failure.
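
For what it's worth, here is a rough Java sketch of that idea in the copy-to-another-file variant. It reads the central directory, copies the raw bytes of every kept entry (local header plus compressed data) into a new file, and then writes a central directory with adjusted offsets, so nothing is inflated or deflated. The class `RawZipDelete` and its method names are invented for this example. It assumes a plain archive: no ZIP64, no encryption of the central directory, entries stored back to back, and entry names that decode as UTF-8. It also drops any archive comment. Treat it as an illustration and test it on copies before trusting it with real data.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: copies a ZIP to a new file, skipping some entries, without decompressing. */
public class RawZipDelete {
    private static final int EOCD_SIG = 0x06054b50; // end of central directory record
    private static final int CEN_SIG = 0x02014b50;  // central directory file header

    public static void deleteEntries(Path src, Path dst, Set<String> namesToDelete) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.CREATE,
                     StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE)) {
            // Find the end-of-central-directory record by scanning the tail backwards.
            int tailLen = (int) Math.min(in.size(), 22 + 65535);
            ByteBuffer tail = read(in, in.size() - tailLen, tailLen);
            int eocd = tailLen - 22;
            while (eocd >= 0 && tail.getInt(eocd) != EOCD_SIG) eocd--;
            if (eocd < 0) throw new IOException("Not a ZIP file: " + src);
            int cdSize = tail.getInt(eocd + 12);
            long cdOffset = tail.getInt(eocd + 16) & 0xFFFFFFFFL;
            if (cdSize < 0 || cdOffset == 0xFFFFFFFFL) throw new IOException("ZIP64 is not supported");

            // Walk the central directory, noting each record and where its local data starts.
            ByteBuffer cd = read(in, cdOffset, cdSize);
            List<int[]> records = new ArrayList<>(); // {position in cd, record length}
            TreeSet<Long> starts = new TreeSet<>();
            for (int p = 0; p + 46 <= cdSize && cd.getInt(p) == CEN_SIG; ) {
                int len = 46 + u16(cd, p + 28) + u16(cd, p + 30) + u16(cd, p + 32);
                records.add(new int[] {p, len});
                starts.add(cd.getInt(p + 42) & 0xFFFFFFFFL);
                p += len;
            }
            starts.add(cdOffset); // the last entry's data ends where the central directory begins

            ByteBuffer newCd = ByteBuffer.allocate(cdSize).order(ByteOrder.LITTLE_ENDIAN);
            int kept = 0;
            for (int[] r : records) {
                byte[] rec = new byte[r[1]];
                cd.position(r[0]);
                cd.get(rec);
                String name = new String(rec, 46, u16(cd, r[0] + 28), StandardCharsets.UTF_8);
                if (namesToDelete.contains(name)) continue; // skip both data and directory record
                long start = cd.getInt(r[0] + 42) & 0xFFFFFFFFL;
                long end = starts.higher(start); // next entry (or central directory) marks the end
                long newOffset = out.position();
                int at = newCd.position();
                newCd.put(rec).putInt(at + 42, (int) newOffset); // patch the local header offset
                for (long done = 0; done < end - start; ) {
                    long n = in.transferTo(start + done, end - start - done, out);
                    if (n <= 0) throw new IOException("Unexpected end of archive");
                    done += n;
                }
                kept++;
            }

            // Write the new central directory followed by a fresh end record (no comment).
            long newCdOffset = out.position();
            newCd.flip();
            ByteBuffer trailer = ByteBuffer.allocate(22).order(ByteOrder.LITTLE_ENDIAN);
            trailer.putInt(EOCD_SIG).putShort((short) 0).putShort((short) 0)
                   .putShort((short) kept).putShort((short) kept)
                   .putInt(newCd.remaining()).putInt((int) newCdOffset).putShort((short) 0);
            trailer.flip();
            while (newCd.hasRemaining()) out.write(newCd);
            while (trailer.hasRemaining()) out.write(trailer);
        }
    }

    private static int u16(ByteBuffer b, int index) { return b.getShort(index) & 0xFFFF; }

    private static ByteBuffer read(FileChannel ch, long pos, int len) throws IOException {
        ByteBuffer b = ByteBuffer.allocate(len).order(ByteOrder.LITTLE_ENDIAN);
        while (b.hasRemaining() && ch.read(b, pos + b.position()) >= 0) { /* keep reading */ }
        b.flip();
        return b;
    }
}
```

Instead of scanning local headers, the sketch takes the span of each entry to run from its local header offset to the next entry's offset, which also carries along any data descriptor that follows the compressed data. Once the new file has been verified, it can replace the original.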

Yes, this is possible in Java using a library called TrueZIP.

TrueZIP is a Java-based virtual file system (VFS) which enables client applications to perform CRUD (Create, Read, Update, Delete) operations on archive files as if they were virtual directories, even with nested archive files in multithreaded environments.

See the link below for more information: https://truezip.java.net/

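OK, I think I found a potential solution from www.javaer.org. It definitely deletes files inside the zip, and it does not extract anything to disk. Note, however, that because it streams through ZipInputStream and ZipOutputStream, every entry that is kept is decompressed and recompressed in memory. Here is the code: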

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public static void deleteZipEntry(File zipFile, String[] files) throws IOException {
    // Get a temp file name, then delete the file itself; otherwise the
    // existing zip cannot be renamed to it.
    File tempFile = File.createTempFile(zipFile.getName(), null);
    tempFile.delete();
    tempFile.deleteOnExit();
    boolean renameOk = zipFile.renameTo(tempFile);
    if (!renameOk) {
        throw new RuntimeException("could not rename the file " + zipFile.getAbsolutePath()
                + " to " + tempFile.getAbsolutePath());
    }
    byte[] buf = new byte[1024];

    // try-with-resources closes both streams (and completes the ZIP file) even on failure.
    try (ZipInputStream zin = new ZipInputStream(new FileInputStream(tempFile));
         ZipOutputStream zout = new ZipOutputStream(new FileOutputStream(zipFile))) {
        ZipEntry entry;
        while ((entry = zin.getNextEntry()) != null) {
            String name = entry.getName();
            if (Arrays.asList(files).contains(name)) {
                continue; // entry is to be deleted: do not copy it
            }
            // Add ZIP entry to output stream.
            zout.putNextEntry(new ZipEntry(name));
            // Transfer bytes from the ZIP file to the output file.
            // Reading inflates the entry and writing deflates it again.
            int len;
            while ((len = zin.read(buf)) > 0) {
                zout.write(buf, 0, len);
            }
            zout.closeEntry();
        }
    }
    tempFile.delete();
}

This might be old, but here is one way. It does work; I use it constantly without problems.

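import java.io.File;

public boolean deleteFile(String zip_dir, String subfile) {
    // Note: java.io.File addresses the ordinary file system, so zip_dir must be
    // a real directory (for example an extracted archive), not a .zip file.
    return delete(new File(zip_dir, subfile));
}

// Recursively deletes a file or directory; returns true if it was deleted.
private boolean delete(File file) {
    if (file == null || !file.exists())
        return false;
    if (file.isFile())
        return file.delete();
    // listFiles() returns null on an I/O error, so guard against it.
    File[] children = file.listFiles();
    if (children != null) {
        for (File child : children)
            delete(child);
    }
    // The directory itself can only be deleted once it is empty.
    return file.delete();
}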


 