
How to print the percentage progress while zipping a file in Python

I would like to print the percentage completed while a file is being zipped, for instance 1%, 2%, 3%, and so on. I have no idea where to start. How would I go about doing this? Right now I just have the code to zip the file.

Code:

zipPath = zipfile.ZipFile("Files/Zip/" + pic + ".zip", "w")

for root, dirs, files in os.walk(filePath):
    for file in files:
        zipPath.write(os.path.join(root, file), str(pic) + "\\" + file)

print("Done")
zipPath.close()

Unfortunately, you can't get progress on the compression of each individual file from the zipfile module, but you can get an idea of the total progress by keeping track of how many bytes you've processed so far.

As Mikko Ohtamaa suggested, the easiest way to do this is to walk through the file list twice: first to determine the file sizes, and then to do the compression. However, as Kevin mentioned, the contents of the directory could change between these two passes, so the numbers may be inaccurate.

The program below (written for Python 2.6) illustrates the process.

#!/usr/bin/env python

''' zip all the files in dirname into archive zipname

    Use only the last path component in dirname as the 
    archive directory name for all files

    Written by PM 2Ring 2015.02.15

    From http://stackoverflow.com/q/28522669/4014959
'''

import sys
import os
import zipfile


def zipdir(zipname, dirname):
    #Get total data size in bytes so we can report on progress
    total = 0
    for root, dirs, files in os.walk(dirname):
        for fname in files:
            path = os.path.join(root, fname)
            total += os.path.getsize(path)

    #Get the archive directory name
    basename = os.path.basename(os.path.normpath(dirname))

    z = zipfile.ZipFile(zipname, 'w', zipfile.ZIP_DEFLATED)

    #Current data byte count
    current = 0
    for root, dirs, files in os.walk(dirname):
        for fname in files:
            path = os.path.join(root, fname)
            arcname = os.path.join(basename, fname)
            # Guard against division by zero when every file is empty
            percent = 100 * current // total if total else 100
            print '%3d%% %s' % (percent, path)

            z.write(path, arcname)
            current += os.path.getsize(path)
    z.close()


def main():
    if len(sys.argv) < 3:
        print 'Usage: %s zipname dirname' % sys.argv[0]
        sys.exit(1)

    zipname = sys.argv[1]
    dirname = sys.argv[2]
    zipdir(zipname, dirname)


if __name__ == '__main__':
    main()

Note that I open the zip file with the zipfile.ZIP_DEFLATED compression argument; the default is zipfile.ZIP_STORED, i.e., no compression is performed. Also, zip files can cope with both DOS-style and Unix-style path separators, so you don't need to use backslashes in your archive pathnames. As my code shows, you can just use os.path.join() to construct the archive pathname.
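
As a small illustration of these two points, here is a minimal sketch (Python 3). The in-memory buffer, the sample data and the archive name pics/sample.txt are made up for the example and do not come from the code above. It stores the same data once with ZIP_STORED and once with ZIP_DEFLATED, using a forward-slash archive path, and prints the resulting archive sizes.

```python
import io
import zipfile

# Highly repetitive sample data, so DEFLATE has something to compress.
data = b"hello world " * 1000


def archive_size(compression):
    """Return the size in bytes of an in-memory zip archive holding `data`."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression) as z:
        # Hypothetical archive name; forward slashes work on every platform.
        z.writestr("pics/sample.txt", data)
    return len(buf.getvalue())


stored_size = archive_size(zipfile.ZIP_STORED)
deflated_size = archive_size(zipfile.ZIP_DEFLATED)
print("stored:  ", stored_size, "bytes")
print("deflated:", deflated_size, "bytes")
```

With ZIP_STORED the archive is slightly larger than the raw data because of the zip headers; with ZIP_DEFLATED it should be considerably smaller for repetitive input like this.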


BTW, in your code you have str(pic) inside your inner for loop. In general, it's a bit wasteful re-evaluating a function with a constant argument inside a loop. But in this case, it's totally superfluous, since from your first statement it appears that pic is already a string.

The existing answer works only at the file level, i.e., if you have a single huge file to zip, you will not see any progress until the whole operation is finished. In my case I had just one huge file, and I did something like this:

import os
import types
import zipfile
from functools import partial

if __name__ == '__main__':
    out_file = "out.zip"
    in_file = "/path/to/file/to/zip"

    def progress(total_size, original_write, self, buf):
        progress.bytes += len(buf)
        progress.obytes += 1024 * 8  # Hardcoded in zipfile.write
        print("{} bytes written".format(progress.bytes))
        print("{} original bytes handled".format(progress.obytes))
        # Clamp to 100: the last chunk is usually smaller than 8 KiB
        print("{} % done".format(min(100, int(100 * progress.obytes / total_size))))
        return original_write(buf)
    progress.bytes = 0
    progress.obytes = 0

    with zipfile.ZipFile(out_file, 'w', compression=zipfile.ZIP_DEFLATED) as _zip:
        # Replace original write() with a wrapper to track progress
        _zip.fp.write = types.MethodType(partial(progress, os.path.getsize(in_file),
                                                 _zip.fp.write), _zip.fp)
        _zip.write(in_file)

This is not optimal, since it relies on a hardcoded number of bytes handled per call to write(), which could change.

Also, the function is called quite frequently, so a UI should probably not be updated on every call.
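
One way around both caveats, sketched here for Python 3.6 or later, is to copy the data into the archive yourself through ZipFile.open(name, "w"). You then know exactly how many input bytes have been handled, and you can report only when the whole-number percentage changes. The function name zip_with_progress, the report callback and the chunk_size default are assumptions made for this example, not part of the code above.

```python
import os
import zipfile


def zip_with_progress(in_file, out_file, report=print, chunk_size=1024 * 1024):
    """Zip one file, calling report(percent) each time the percentage changes."""
    total = os.path.getsize(in_file)
    done = 0
    last_percent = -1
    with zipfile.ZipFile(out_file, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # force_zip64=True because the member size is not declared up front
        # and the input may be larger than 2 GiB.
        with open(in_file, "rb") as src, \
                zf.open(os.path.basename(in_file), "w", force_zip64=True) as dst:
            while True:
                chunk = src.read(chunk_size)
                if not chunk:
                    break
                dst.write(chunk)
                done += len(chunk)
                percent = 100 * done // total if total else 100
                # Report only on change, so a UI is not updated for every chunk.
                if percent != last_percent:
                    report(percent)
                    last_percent = percent
    return done
```

A call such as zip_with_progress("/path/to/file/to/zip", "out.zip") prints each percentage at most once. Note that, unlike ZipFile.write(), a member created this way gets the current time as its timestamp rather than the source file's modification time.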
