Adding a BytesIO to a BytesIO tar.gz in Python

I'm having trouble writing a .tar.gz file in Python from a BytesIO object. Writing a plain tar file works fine, but if I change the write mode to produce a .tar.gz (or bz2, or xz), the result is not a valid archive.

I've made a stripped down version below:

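import tarfile
import time
from io import BytesIO

def string_to_tarfile(name, string):
    # Encode the text and wrap it in an in-memory file object
    encoded = string.encode('utf-8')
    s = BytesIO(encoded)

    # The size must be set explicitly: addfile() reads exactly tar_info.size bytes
    tar_info = tarfile.TarInfo(name=name)
    tar_info.mtime = time.time()
    tar_info.size = len(encoded)

    return s, tar_info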

file1='hello'
file2='world'

f=BytesIO()
tar = tarfile.open(fileobj=f, mode='w:gz')
string, tar_info = string_to_tarfile("file1.txt", file1)
tar.addfile(tarinfo=tar_info, fileobj=string)

string, tar_info = string_to_tarfile("file2.txt", file2)
tar.addfile(tarinfo=tar_info, fileobj=string)

f.seek(0)
with open('whatevs.tar.gz', 'wb') as out:
    out.write(f.read())

What this should do is make a whatevs.tar.gz file with "file1.txt" and "file2.txt" in it.

If I replace 'w:gz' with 'w' (and remove the .gz ending), I get a tar file with the correct contents, but switching back to 'w:gz' results in a corrupt, 10-byte tar.gz file.

I want to write this to a BytesIO because I am actually uploading it to S3.

I'm not sure if I'm grossly misreading the docs here. I've looked through a huge number of posts, and they either make plain tar files (which works fine, but is not what I want) or write to the local file system (again, I'm uploading to S3, so I don't want to write it locally).

Thank you!

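I think closing the tarfile object will solve your problem. The gzip stream is only flushed and finalized when the archive is closed, so until then f holds little more than the gzip header. Close the archive before you seek back to the start of f and read from it.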

f = BytesIO()
tar = tarfile.open(fileobj=f, mode='w:gz')
string, tar_info = string_to_tarfile("file1.txt", file1)
tar.addfile(tarinfo=tar_info, fileobj=string)

string, tar_info = string_to_tarfile("file2.txt", file2)
tar.addfile(tarinfo=tar_info, fileobj=string)
tar.close()  # finalizes the archive and flushes the compressed data into f
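
To see the effect of close() for yourself, here is a small sketch that compares the size of the buffer before and after closing. It reuses the string_to_tarfile helper from the question; the names buf, size_before_close and size_after_close are just for illustration. Before the close, the buffer should contain little more than the gzip header, which matches the 10-byte file described in the question.

```python
import tarfile
import time
from io import BytesIO

def string_to_tarfile(name, string):
    # Same helper as in the question
    encoded = string.encode('utf-8')
    tar_info = tarfile.TarInfo(name=name)
    tar_info.mtime = time.time()
    tar_info.size = len(encoded)
    return BytesIO(encoded), tar_info

buf = BytesIO()
tar = tarfile.open(fileobj=buf, mode='w:gz')
payload, tar_info = string_to_tarfile("file1.txt", "hello")
tar.addfile(tarinfo=tar_info, fileobj=payload)

# The compressed data is still buffered inside the gzip layer at this point
size_before_close = len(buf.getvalue())

# Closing writes the end-of-archive blocks and flushes the gzip stream
tar.close()
size_after_close = len(buf.getvalue())

print(size_before_close, size_after_close)
```

Note that closing the tarfile object does not close buf itself, since it was passed in by the caller, so it can still be read afterwards.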

To avoid this kind of unclosed-file problem, I think it is safer to use a with statement, which closes the archive automatically when the block exits:

f = BytesIO()
with tarfile.open(fileobj=f, mode='w:gz') as tar:
    string, tar_info = string_to_tarfile("file1.txt", file1)
    tar.addfile(tarinfo=tar_info, fileobj=string)

    string, tar_info = string_to_tarfile("file2.txt", file2)
    tar.addfile(tarinfo=tar_info, fileobj=string)
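
Putting it together, a minimal end-to-end sketch could look like the following. The build_tar_gz function and the dict-of-strings input are assumptions made for illustration, not part of the original code. It builds the archive in memory, returns the raw bytes, and then reads them back to confirm the archive is valid. The upload step is left out on purpose; the returned bytes (or a fresh BytesIO wrapping them) are what you would hand to your S3 client.

```python
import tarfile
import time
from io import BytesIO

def build_tar_gz(files):
    """Return the bytes of a .tar.gz archive built from a {name: text} dict.

    Hypothetical helper for illustration only.
    """
    buf = BytesIO()
    # The with block closes the archive, which finalizes the gzip stream
    with tarfile.open(fileobj=buf, mode='w:gz') as tar:
        for name, text in files.items():
            encoded = text.encode('utf-8')
            tar_info = tarfile.TarInfo(name=name)
            tar_info.mtime = time.time()
            tar_info.size = len(encoded)
            tar.addfile(tarinfo=tar_info, fileobj=BytesIO(encoded))
    # Read the buffer only after the archive has been closed
    return buf.getvalue()

archive_bytes = build_tar_gz({"file1.txt": "hello", "file2.txt": "world"})

# Round trip: reopen the bytes as a gzip-compressed tar and read the members
with tarfile.open(fileobj=BytesIO(archive_bytes), mode='r:gz') as tar:
    contents = {
        member.name: tar.extractfile(member).read().decode('utf-8')
        for member in tar.getmembers()
    }

print(contents)
```

Using getvalue() avoids the need for f.seek(0) before reading; if you pass the BytesIO object itself to an uploader instead, remember to seek back to the start first.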
