Python ZipFile module extracts password protected zips slowly

I am trying to write a Python script that extracts a password-protected zip file:

Board: BeagleBone Black, ~1 GHz ARM Cortex-A8, Debian Wheezy
Zip file: /home/milo/my.zip, ~8 MB

>>> from zipfile import ZipFile
>>> zip = ZipFile("/home/milo/my.zip")
>>> zip.extractall(pwd="tst")

Other approaches (opening the archive and reading and writing each member myself, or extracting only a single file) have the same effect: extraction takes about 3-4 minutes.

Extracting the same file with the unzip command-line tool takes less than 2 seconds.

Does anyone know what is wrong with my code, or with the Python zipfile library?

Thanks, Ajava

This is a documented limitation of the zipfile module in Python 2.7. The documentation for ZipFile states:

Decryption is extremely slow as it is implemented in native Python rather than C.

If you need faster performance, you can either invoke an external program (like unzip or 7zip) from your code, or make sure the zip files you are working with are not password protected.
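Here is a rough sketch of the first option, assuming Python 3 and the Info-ZIP unzip tool on the PATH. The helper names (is_encrypted, unzip_command, extract) are made up for illustration. It checks the encryption flag of each member and hands the archive to unzip only when decryption is actually needed:

```python
import subprocess
import zipfile


def is_encrypted(archive):
    """Return True if any member has the encryption flag (bit 0) set."""
    with zipfile.ZipFile(archive) as zf:
        return any(info.flag_bits & 0x1 for info in zf.infolist())


def unzip_command(archive, dest, password):
    """Build the argument list for Info-ZIP's unzip (hypothetical helper)."""
    # -o: overwrite without prompting, -P: password, -d: target directory
    return ["unzip", "-o", "-P", password, archive, "-d", dest]


def extract(archive, dest, password=None):
    """Extract with unzip if the archive is encrypted, else with zipfile."""
    if is_encrypted(archive):
        if password is None:
            raise ValueError("archive is encrypted but no password was given")
        # Assumption: the external unzip binary is installed on this machine.
        subprocess.run(unzip_command(archive, dest, password), check=True)
    else:
        # No decryption involved, so the pure-Python path is fast enough.
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
```

Keep in mind that a password passed with -P can be seen by other users in the process list, so this only makes sense where that is acceptable.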

Copied from my answer: https://stackoverflow.com/a/72513075/10860732

It's quite stupid that Python doesn't implement zip decryption in C.

So I implemented it in Cython, which is about 17 times faster.

Just get the dezip.pyx and setup.py from this gist.

https://gist.github.com/zylo117/cb2794c84b459eba301df7b82ddbc1ec

Then install Cython and build the extension module:

pip3 install cython
python3 setup.py build_ext --inplace

Then run the original script with two extra lines:

import zipfile

# add these two lines
from dezip import _ZipDecrypter_C
setattr(zipfile, '_ZipDecrypter', _ZipDecrypter_C)

z = zipfile.ZipFile('./test.zip', 'r')
z.extractall('/tmp/123', None, b'password')
