
With Python's 'tarfile', how can I get the top-most directory in a tar archive?

I want to upload a theme archive to a Django web module and use the name of the top-most directory in the archive as the theme's name. The archive will always be in tar-gzip format and will always have exactly one folder at the top level (though other files may sit alongside it), with sub-directories containing templates, CSS, images, etc. in whatever order suits the theme best.

Currently, based on the very useful code from MegaMark16, my tool uses the following method:

f = tarfile.open(fileobj=self.theme_file, mode='r:gz')
self.name = f.getnames()[0]

Here self.theme_file is a full path to the uploaded file. This works fine as long as the top-level directory happens to be the first entry in the tarball, but in many cases it is not. I can certainly loop through the entire archive and manually check for the proper 'name' characteristics, but I suspect that there is a more elegant and faster approach. Any suggestions?

You'll want to use a function called os.path.commonprefix.

Sample code would be something to the effect of:

import os
import tarfile
archive = tarfile.open(filepath, mode='r')
print(os.path.commonprefix(archive.getnames()))

The printed value would be the 'topmost directory in the archive', that is, your theme name.

Edit: upon further reading of your specs, though, this approach may not yield your desired result if you have files that are siblings to the 'topmost directory', as the common prefix would then just be '.' or an empty string; it only works if ALL entries do, in fact, share the common prefix of your theme name.
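To make that caveat concrete, here is a small sketch of how os.path.commonprefix behaves on lists of names like those getnames() might return. The member names are made up for illustration. Note also that commonprefix compares character by character rather than path component by path component, so it can return a partial name.

```python
import os.path

# Hypothetical member lists, shaped like the output of TarFile.getnames().
only_theme = ["mytheme", "mytheme/css/site.css", "mytheme/templates/base.html"]
with_sibling = only_theme + ["README.txt"]
two_similar = ["theme_a/index.html", "theme_b/index.html"]

# Every entry shares the theme directory, so the prefix is the theme name.
prefix_clean = os.path.commonprefix(only_theme)

# A sibling file at the top level shares nothing with the theme directory.
prefix_sibling = os.path.commonprefix(with_sibling)

# The comparison is per character, so this yields a partial name.
prefix_partial = os.path.commonprefix(two_similar)

print(repr(prefix_clean), repr(prefix_sibling), repr(prefix_partial))
```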

Every entry inside a sub-directory has a '/' in its name, so you can do something like this:

self.name = [name for name in f.getnames() if '/' not in name][0]

and optimize with other tricks.
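Since the question allows plain files next to the theme folder, a filter on '/' alone would also match those files. One possible refinement, sketched below, is to collect the first path component of every entry that either is a directory or has something beneath it. The function name top_level_dir and its error handling are assumptions made for illustration, not part of tarfile. It still scans every member once, which seems hard to avoid because a tar archive has no central index.

```python
import tarfile


def top_level_dir(archive):
    """Return the single top-level directory name of an open TarFile.

    Hypothetical helper: raises ValueError unless exactly one
    top-level directory is found.
    """
    candidates = set()
    for member in archive.getmembers():
        # Normalise names such as './mytheme/css' to 'mytheme/css'.
        name = member.name
        while name.startswith("./"):
            name = name[2:]
        name = name.strip("/")
        if not name or name == ".":
            continue
        first, sep, _rest = name.partition("/")
        # Count the first component if the entry is a directory itself,
        # or if it lives beneath one (some archives omit directory entries).
        if sep or member.isdir():
            candidates.add(first)
    if len(candidates) != 1:
        raise ValueError(
            "expected one top-level directory, found %r" % sorted(candidates)
        )
    return candidates.pop()
```

With the code from the question, this could be used as self.name = top_level_dir(f), where f is the TarFile already opened with mode='r:gz'.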
