
Unzipping large files using Python

I am attempting to unzip files of various sizes (some 4 GB or larger) using Python, but I have noticed that on several occasions, especially when the files are extremely large, the files fail to unzip. When I open the resulting file, it is empty. Below is the code I am using - is there anything wrong with my approach?

        import gzip

        inF = gzip.open(localFile, 'rb')
        localFile = localFile[:-3]
        outF = open(localFile, 'wb')
        outF.write( inF.read() )
        inF.close()
        outF.close()

In this case it looks like you don't need Python to do any processing on the file you read in, so you might be better off just using subprocess.Popen:

from subprocess import Popen

# gunzip -c writes the decompressed data to stdout; the shell redirects it into outfilename
Popen('gunzip -c %s > %s' % (infilename, outfilename), shell=True).wait()

Because the command relies on shell redirection, you need to pass shell=True, but other than that it should be good.
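
If you would rather avoid shell=True, a minimal sketch of the same idea using the list form (assuming infilename and outfilename are defined as above) could look like:

import subprocess

# gunzip -c sends the decompressed bytes to stdout; we redirect stdout into the
# output file ourselves, so no shell is involved
with open(outfilename, 'wb') as out:
    subprocess.run(['gunzip', '-c', infilename], stdout=out, check=True)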

Another solution for large .zip files (works on Ubuntu 16.04.4). First install 7z:

sudo apt-get install p7zip-full

Then, in your Python code, call 7z with:

import subprocess

# 'x' extracts with full paths; -o sets the output directory (no space between -o and the path)
subprocess.call(['7z', 'x', src_file, '-o' + target_dir])
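
If you want a failed extraction to raise an exception instead of passing silently, a small variant (same assumed src_file and target_dir) is:

import subprocess

# check_call raises CalledProcessError when 7z exits with a non-zero status
subprocess.check_call(['7z', 'x', src_file, '-o' + target_dir])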

This code loops over blocks of input data, writing each block to the output file. This way we never read the entire input into memory at once, which conserves memory and avoids mysterious crashes.

import gzip, os

localFile = 'cat.gz'
outFile = os.path.splitext(localFile)[0]

print('Unzipping {} to {}'.format(localFile, outFile))

with gzip.open(localFile, 'rb') as inF:
    with open(outFile, 'wb') as outF:
        while True:
            block = inF.read(1024 * 1024)  # read up to 1 MiB of decompressed data
            if not block:                  # an empty result means end of file
                break
            outF.write(block)
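
An equivalent and slightly shorter sketch uses shutil.copyfileobj, which does the same block-wise copy for you (assuming the same localFile and outFile names as above):

import gzip, os, shutil

localFile = 'cat.gz'
outFile = os.path.splitext(localFile)[0]

# copyfileobj streams the data in fixed-size chunks, so memory use stays bounded
with gzip.open(localFile, 'rb') as inF, open(outFile, 'wb') as outF:
    shutil.copyfileobj(inF, outF)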
