
Unzipping large files using Python

I am trying to unzip files of various sizes (some are 4 GB or larger) using Python. On several occasions, especially when the files are extremely large, the file fails to unzip: when I open the resulting file, it is empty. Below is the code I am using. Is there anything wrong with my approach?

        inF = gzip.open(localFile, 'rb')
        localFile = localFile[:-3]
        outF = open(localFile, 'wb')
        outF.write( inF.read() )
        inF.close()
        outF.close()

In this case it looks like you don't need Python to do any processing on the file you read in, so you might be better off just using subprocess.Popen:

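from subprocess import Popen
# gunzip -c writes the decompressed data to stdout; send it to the output file
with open(outfilename, 'wb') as out:
    Popen(['gunzip', '-c', infilename], stdout=out).wait()

Passing the arguments as a list means shell=True is not needed. If you pass the command as a single string instead, you will need shell=True. Other than that, this should be good.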

Another solution for large .zip files (works on Ubuntu 16.04.4). First install 7z:

sudo apt-get install p7zip-full

Then, in your Python code, call 7z with:

import subprocess
subprocess.call(['7z', 'x', src_file, '-o'+target_dir])
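If installing 7z is not an option, a pure-Python alternative may be worth trying. The sketch below is not part of the answer above: it uses the standard-library zipfile module, which can read archives that use the ZIP64 extensions (archives larger than 4 GiB). The function name unzip_all is hypothetical, and src_file and target_dir are assumed to have the same meaning as in the 7z call.

```python
import zipfile


def unzip_all(src_file, target_dir):
    """Extract every member of src_file into target_dir.

    Hypothetical helper: a standard-library stand-in for the 7z call.
    """
    with zipfile.ZipFile(src_file) as archive:
        # extractall copies each member to disk in chunks,
        # so a large member is not held in memory all at once
        archive.extractall(target_dir)
        # Return the member names so the caller can see what was extracted
        return archive.namelist()
```

Calling unzip_all(src_file, target_dir) returns the list of member names, which is an easy way to confirm that the archive was not empty.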

This code loops over blocks of input data, writing each to an output file. This way we never read the entire input into memory at once, which conserves memory and avoids mysterious crashes.

import gzip, os

localFile = 'cat.gz'
outFile = os.path.splitext(localFile)[0]

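print('Unzipping {} to {}'.format(localFile, outFile))

with gzip.open(localFile, 'rb') as inF:
    with open(outFile, 'wb') as outF:
        # Read and write one 1024-byte block at a time until the input is exhausted
        while True:
            block = inF.read(1024)
            if not block:
                break
            outF.write(block)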
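The same block-by-block copy can also be written with shutil.copyfileobj from the standard library, which reads from one file object and writes to another in fixed-size chunks. This is a minimal sketch rather than the answerer's code: the function name gunzip_stream and the 1 MiB default block size are assumptions chosen for illustration.

```python
import gzip
import os
import shutil


def gunzip_stream(src_path, dst_path=None, block_size=1024 * 1024):
    """Decompress src_path to dst_path without loading it all into memory.

    Hypothetical helper; block_size is an assumed default, not a tuned value.
    """
    if dst_path is None:
        # Mirror the snippet above: strip the final extension (.gz)
        dst_path = os.path.splitext(src_path)[0]
    with gzip.open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        # copyfileobj reads and writes block_size bytes at a time
        shutil.copyfileobj(src, dst, block_size)
    return dst_path
```

For example, gunzip_stream('cat.gz') would write the decompressed data to a file named cat in the same directory and return that path.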
