
How to read gz compressed files from tar

Let's say we have a tar file which in turn contains multiple gzip compressed files. I want to be able to read the contents of those gzip files without extracting either the tar file or the individual gzip files to disk. I'm trying to use the tarfile module in Python.

This is untested, but it shows the main idea and the relevant tools. It iterates over the members of the tar and decompresses each gzipped one into the file_contents variable:

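import gzip
import tarfile

# extractfile() is a method of the opened TarFile, not of the tarfile module.
with tarfile.open("your.gz.tar") as tar:
    for member in tar.getmembers():
        if not member.isfile():
            continue  # extractfile() returns None for directories
        fo = tar.extractfile(member)  # file-like object, nothing written to disk
        file_contents = gzip.GzipFile(fileobj=fo).read()  # decompressed bytes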

note: if a file is too large to fit in memory, read it chunk by chunk with a streamed reader instead of calling read() once.
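
A minimal sketch of chunk-by-chunk reading, assuming every regular member of the tar is gzip compressed. The helper name iter_gz_member_chunks and the 64 KiB default chunk size are illustrative choices, not part of the original answer:

```python
import gzip
import tarfile


def iter_gz_member_chunks(tar_path, chunk_size=64 * 1024):
    """Yield (member_name, chunk) pairs of decompressed bytes.

    Hypothetical helper: the name and default chunk size are assumptions.
    It assumes every regular file inside the tar is gzip compressed.
    """
    with tarfile.open(tar_path) as tar:
        for member in tar:
            if not member.isfile():
                continue  # directories and links have no data to read
            raw = tar.extractfile(member)
            with gzip.GzipFile(fileobj=raw) as gz:
                while True:
                    # read() returns at most chunk_size decompressed bytes
                    chunk = gz.read(chunk_size)
                    if not chunk:
                        break
                    yield member.name, chunk
```

Each chunk can be processed and discarded as it arrives, so at most chunk_size decompressed bytes are held in memory at a time.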

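If you have additional logic based on what the member (TarInfo) object looks like, you can use its attributes and methods, such as name, size, isfile() and isdir().

see: the tarfile and gzip sections of the Python standard library documentation.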
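
As a rough illustration of logic based on the TarInfo object, the sketch below keeps only regular files whose name ends in ".gz". The helper name read_gz_members and the suffix check are assumptions made for this example; a tar may well hold gzipped members under other names:

```python
import gzip
import tarfile


def read_gz_members(tar_path, suffix=".gz"):
    """Return {member name: decompressed bytes} for gzipped regular files.

    Hypothetical helper: it assumes gzipped members can be recognised
    by their file name suffix.
    """
    contents = {}
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            # isfile() is False for directories, links and device entries.
            if not member.isfile() or not member.name.endswith(suffix):
                continue
            with tar.extractfile(member) as fo:
                contents[member.name] = gzip.decompress(fo.read())
    return contents
```

This version loads each member fully into memory, so it only suits members that fit in memory.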
