How to create a full compressed tar file using Python?

How can I create a .tar.gz file with compression in Python?

To build a .tar.gz (aka .tgz) for an entire directory tree:

import tarfile
import os.path

def make_tarfile(output_filename, source_dir):
    with tarfile.open(output_filename, "w:gz") as tar:
        tar.add(source_dir, arcname=os.path.basename(source_dir))

This will create a gzipped tar archive containing a single top-level folder with the same name and contents as source_dir.
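As a quick check of that behaviour, here is a minimal sketch that runs the same helper on a throwaway directory and lists the stored member names. The folder name "project", the file "notes.txt", and the demo function are hypothetical and used only for illustration.

```python
import os
import tarfile
import tempfile

def make_tarfile(output_filename, source_dir):
    # Same helper as above: one top-level folder named after source_dir.
    with tarfile.open(output_filename, "w:gz") as tar:
        tar.add(source_dir, arcname=os.path.basename(source_dir))

def demo():
    # Build a throwaway tree <tmp>/project/notes.txt (hypothetical names).
    with tempfile.TemporaryDirectory() as tmp:
        source_dir = os.path.join(tmp, "project")
        os.makedirs(source_dir)
        with open(os.path.join(source_dir, "notes.txt"), "w") as f:
            f.write("hello\n")
        archive = os.path.join(tmp, "project.tar.gz")
        make_tarfile(archive, source_dir)
        # Read the archive back and report what it contains.
        with tarfile.open(archive, "r:gz") as tar:
            return sorted(tar.getnames())

print(demo())  # ['project', 'project/notes.txt']
```

Note that os.path.basename returns an empty string when source_dir ends with a path separator, in which case the top-level folder name is lost, so it may be worth passing the path through os.path.normpath first.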

import tarfile
tar = tarfile.open("sample.tar.gz", "w:gz")
for name in ["file1", "file2", "file3"]:
    tar.add(name)
tar.close()

If you want to create a tar.bz2 compressed file, just replace the file extension with ".tar.bz2" and "w:gz" with "w:bz2".

You call tarfile.open with mode='w:gz', meaning "Open for gzip compressed writing."

You'll probably want to end the filename (the name argument to open) with .tar.gz, but that doesn't affect how the archive is compressed.

BTW, you usually get better compression with a mode of 'w:bz2', just as tar usually compresses better with bzip2 than with gzip.
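Whether bzip2 actually wins depends on the data, so it may be worth measuring. The following is a rough sketch, assuming a hypothetical helper named archive_sizes, that archives the same files once per mode and reports the resulting sizes. The 'w:xz' mode is included only for comparison and relies on Python's lzma support being available.

```python
import os
import tarfile
import tempfile

def archive_sizes(paths, modes=("w:gz", "w:bz2", "w:xz")):
    """Archive the same files once per mode and return {mode: size in bytes}."""
    sizes = {}
    with tempfile.TemporaryDirectory() as tmp:
        for mode in modes:
            # e.g. "w:gz" -> sample.tar.gz; the name has no effect on compression.
            out = os.path.join(tmp, "sample.tar." + mode.split(":")[1])
            with tarfile.open(out, mode) as tar:
                for path in paths:
                    tar.add(path, arcname=os.path.basename(path))
            sizes[mode] = os.path.getsize(out)
    return sizes
```

For example, print(archive_sizes(["file1", "file2", "file3"])) would show the three sizes side by side for those files.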

Previous answers advise using the tarfile Python module to create a .tar.gz file. That's obviously a good, Pythonic solution, but it has a serious drawback in archiving speed. This question mentions that tarfile is approximately two times slower than the tar utility on Linux. In my experience, this estimate is about right.

So for faster archiving you can run the tar command through the subprocess module:

import subprocess
subprocess.call(['tar', '-czf', output_filename, file_to_archive])

In addition to @Aleksandr Tukallo's answer, you can also capture the output and the error message (if one occurs). Compressing a folder using tar is explained pretty well in the following answer.

import traceback
import subprocess

try:
    # -c create, -z gzip, -f archive file; -j (bzip2) cannot be combined with -z
    cmd = ['tar', '-czf', output_filename, file_to_archive]
    output = subprocess.check_output(cmd).decode("utf-8").strip()
    print(output)
except Exception:
    print(f"E: {traceback.format_exc()}")

This compresses files into a tar.gz archive from the current working directory. The solution is to use os.path.basename(file_directory):

import os
import tarfile

with tarfile.open("save.tar.gz", "w:gz") as tar:
    for file in ["a.txt", "b.log", "c.png"]:
        # Each file is looked up by its base name in the current directory.
        tar.add(os.path.basename(file))

It is used to compress the files into a tar.gz archive in the directory.

Minor correction to @THAVASI.T's answer, which omits the import of the 'tarfile' library and does not define the 'tar' object used in the third line.

import os
import tarfile

with tarfile.open("save.tar.gz", "w:gz") as tar:
    for file in ["a.txt", "b.log", "c.png"]:
        tar.add(os.path.basename(file))

shutil.make_archive is very convenient for both files and directories (contents are added to the archive recursively):

import shutil

compressed_file = shutil.make_archive(
        base_name='archive',   # archive file name w/o extension
        format='gztar',        # available formats: zip, gztar, bztar, xztar, tar
        root_dir='path/to/dir' # directory to compress
)
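With only root_dir given, the directory's contents end up at the top of the archive. If a single top-level folder is wanted instead, as in the tarfile answers, one option is to also pass base_dir. The sketch below assumes a hypothetical wrapper named archive_folder; root_dir and base_dir are real parameters of shutil.make_archive.

```python
import os
import shutil

def archive_folder(source_dir, base_name):
    """Create <base_name>.tar.gz holding source_dir as a single top-level folder."""
    source_dir = os.path.abspath(source_dir)
    return shutil.make_archive(
        base_name=base_name,
        format="gztar",
        root_dir=os.path.dirname(source_dir),   # archive paths are relative to this
        base_dir=os.path.basename(source_dir),  # only this folder is archived
    )
```

The return value is the path of the archive that was written, including the .tar.gz extension.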

Just restating @George V. Reilly's excellent answer, but in a clearer form...

import tarfile


fd_path="/some/folder/path/"
fl_name="some_file_name.ext"
targz_fd_path_n_fl_name="/some/folder/path/some_file_name.tar.gz"

with tarfile.open(targz_fd_path_n_fl_name, "w:gz") as tar:
    tar.add(fd_path + fl_name, fl_name)

As @Brōtsyorfuzthrāx pointed out (but in another way), if you leave out the second argument of the "add" method, it will give you the entire path structure of fd_path + fl_name in the tar file.

Of course you can use...

import tarfile
import os

fd_path_n_fl_name="/some/folder/path/some_file_name.ext"
targz_fd_path_n_fl_name="/some/folder/path/some_file_name.tar.gz"

with tarfile.open(targz_fd_path_n_fl_name, "w:gz") as tar:
    tar.add(fd_path_n_fl_name, os.path.basename(fd_path_n_fl_name))

... if you don't want to use or don't have the folder path and file name separated.
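To see the effect of that second argument (arcname), here is a small sketch that adds one file to a temporary archive and returns the names stored in it. The helper name member_names is hypothetical.

```python
import os
import tarfile
import tempfile

def member_names(file_path, arcname=None):
    """Add one file to a temporary .tar.gz and return the stored member names."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "demo.tar.gz")
        with tarfile.open(archive, "w:gz") as tar:
            # arcname=None keeps the path as given (minus any leading slash).
            tar.add(file_path, arcname=arcname)
        with tarfile.open(archive, "r:gz") as tar:
            return tar.getnames()
```

Calling member_names(path) keeps the folder structure of path in the stored name, while member_names(path, os.path.basename(path)) stores only the file name.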

Thanks!

Perfect answer

Best performance, and without the . and .. in the compressed file!

NOTICE (thanks MaxTruxa):

This answer is vulnerable to shell injections. Please read the security considerations from the docs. Never pass unescaped strings to subprocess.run, subprocess.call, etc. if shell=True. Use shlex.quote to escape (Unix shells only).

I'm using it locally, so it's good for my needs.

import subprocess
subprocess.call(f'tar -cvzf {output_filename} *', cwd=source_dir, shell=True)

The cwd argument changes the directory before compressing, which solves the issue with the dots.

shell=True allows wildcard usage (*).

It also works recursively for a directory.
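If the file names are not fully trusted, a possible way to keep the cwd trick while avoiding the shell is to expand the wildcard in Python and pass an argument list. This is only a sketch under stated assumptions: build_tar_command and make_tarfile_cli are hypothetical names, a tar executable is assumed to be on the PATH, and the output file is assumed to live outside source_dir.

```python
import os
import subprocess

def build_tar_command(output_filename, source_dir):
    """Return an argument list that archives every entry in source_dir by name."""
    # Replaces the shell's `*`; unlike `*`, os.listdir also returns dotfiles.
    entries = sorted(os.listdir(source_dir))
    # "--" marks the end of options, so a file named "-x" is not read as a flag.
    return ["tar", "-czf", os.path.abspath(output_filename), "--", *entries]

def make_tarfile_cli(output_filename, source_dir):
    cmd = build_tar_command(output_filename, source_dir)
    # cwd plays the same role as in the one-liner above; no shell is involved.
    subprocess.run(cmd, cwd=source_dir, check=True)
```

Because no shell parses the command, names containing spaces or shell metacharacters are passed to tar unchanged. The output path is made absolute because a relative one would otherwise be resolved against source_dir.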
