Compress Large Files with Python, really fast

I'm using this function to gzip a file:


import gzip
import os
import shutil

def zip_file(path_data, path_zip, File):
    with open(os.path.join(path_data, File), "rb") as f_in, gzip.open(os.path.join(path_zip, File) + ".gz", "wb") as f_out:
        shutil.copyfileobj(f_in, f_out, length=16*1024*1024)

But it takes 1604.954 seconds to gzip a 14 GB file with 4 columns, and I have to process 96 files like this.

Add the parameter compresslevel=1 to your gzip.open call. You can experiment with levels between 1 and 5 (Python's gzip.open defaults to 9, the slowest level, while zlib's own default is 6; either way, apparently you don't like it). See where you prefer the trade-off between time and compression ratio.
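To find that trade-off without waiting on a full 14 GB run, you could time a few levels on a small sample first. This is only a rough sketch: compare_levels is a made-up helper name, and the generated sample is a hypothetical stand-in for a slice of your real file, so the numbers it prints will differ from what you see on your data.

```python
import gzip
import time


def compare_levels(sample: bytes, levels=(1, 3, 5, 6, 9)):
    """Return (level, seconds, compressed_size / original_size) for each level."""
    results = []
    for level in levels:
        start = time.perf_counter()
        compressed = gzip.compress(sample, compresslevel=level)
        elapsed = time.perf_counter() - start
        results.append((level, elapsed, len(compressed) / len(sample)))
    return results


if __name__ == "__main__":
    # Hypothetical 4-column sample; replace with the first few MB of a real file.
    sample = b"".join(
        b"%d,%d,%d,%d\n" % (i, i * 2, i % 7, i % 13) for i in range(200_000)
    )
    for level, seconds, ratio in compare_levels(sample):
        print(f"level={level} time={seconds:.3f}s ratio={ratio:.3f}")
```

Pick the lowest level whose ratio you can live with, then pass it as compresslevel to gzip.open.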

By the way, you shouldn't call it "zip_file". It is not a zip file, which is an entirely different thing from a gzip file. Call it "gzip_file", or something else.
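Putting both suggestions together, the function might look something like this. It is a sketch rather than a drop-in replacement: the compresslevel argument and the returned output path are additions that were not in the question's code, and 1 is just a starting value to tune.

```python
import gzip
import os
import shutil


def gzip_file(path_data, path_zip, file_name, compresslevel=1):
    """Gzip path_data/file_name into path_zip/file_name.gz and return the output path."""
    src = os.path.join(path_data, file_name)
    dst = os.path.join(path_zip, file_name + ".gz")
    # Stream in 16 MiB chunks so the whole file is never held in memory.
    with open(src, "rb") as f_in, gzip.open(dst, "wb", compresslevel=compresslevel) as f_out:
        shutil.copyfileobj(f_in, f_out, length=16 * 1024 * 1024)
    return dst
```

The output is still an ordinary .gz file, so anything that reads gzip can decompress it regardless of the level used.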
