
How to read gz compressed files from tar

Let's say we have a tar file which in turn contains multiple gzip-compressed files. I want to be able to read the contents of those gzip files without extracting either the tar file or the individual gzip files to disk. I'm trying to use the tarfile module in Python.

This might work. I haven't tested it, but it shows the main ideas and the related tools. It iterates over the files in the tar and, if they are gzipped, reads them into the file_contents variable:

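import gzip
import tarfile

with tarfile.open("your.gz.tar") as tar:
    for member in tar.getmembers():
        fo = tar.extractfile(member)  # None for directories and other non-file members
        if fo is None or not member.name.endswith(".gz"):
            continue
        # Decompress the gzip member in memory, without writing anything to disk
        file_contents = gzip.GzipFile(fileobj=fo).read()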

Note: if a file is too large to fit in memory, consider a streamed reader that processes it chunk by chunk.
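A chunked read along those lines might look like the sketch below. The helper name `iter_gzip_chunks`, the `.gz` suffix check, and the 64 KiB default chunk size are assumptions made for illustration, not part of the original answer.

```python
import gzip
import tarfile


def iter_gzip_chunks(tar_path, chunk_size=64 * 1024):
    """Yield (member_name, chunk) pairs for each .gz member of a tar archive.

    Hypothetical helper: at most `chunk_size` bytes of decompressed data
    are returned per read, so a large member is never loaded whole.
    """
    with tarfile.open(tar_path) as tar:
        for member in tar:
            # Skip directories, links, and members without a .gz suffix
            if not member.isfile() or not member.name.endswith(".gz"):
                continue
            raw = tar.extractfile(member)
            with gzip.GzipFile(fileobj=raw) as gz:
                while True:
                    chunk = gz.read(chunk_size)
                    if not chunk:
                        break
                    yield member.name, chunk
```

Each chunk can then be handed to whatever processes the data (a parser, a hash, a socket) before the next one is read, for example `for name, chunk in iter_gzip_chunks("your.gz.tar"): ...`.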

If you need additional logic based on what the member (TarInfo) object looks like, you can use its attributes and methods, such as name, size, isfile(), and isdir().
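As a rough illustration of filtering on TarInfo, the sketch below selects members by type, name, and size. The function `list_gzip_members` and its `max_size` parameter are hypothetical; `isfile()`, `name`, and `size` are standard TarInfo members. Note that `size` is the size stored in the archive, which for a gzipped member is the compressed size.

```python
import tarfile


def list_gzip_members(tar_path, max_size=None):
    """Return names of regular-file members ending in .gz.

    Hypothetical helper: if `max_size` is given, members whose stored
    (compressed) size exceeds it are skipped.
    """
    selected = []
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            if not member.isfile():  # skip directories, links, devices
                continue
            if not member.name.endswith(".gz"):
                continue
            if max_size is not None and member.size > max_size:
                continue
            selected.append(member.name)
    return selected
```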

