I'm using the code below to extract .tgz files. The type of log files (.tgz) that I need to extract have sub-directories that have other .tgz files and .tar files inside them. I want to extract those too.
Ultimately, I'm trying to search for certain strings in all .log files and .txt files that may appear in a .tgz file.
Below is the code that I'm using to extract the .tgz file. I've been trying to work out how to extract the sub-files (.tgz and .tar). So far, I've been unsuccessful.
import os, sys, tarfile
try:
    tar = tarfile.open(sys.argv[1] + '.tgz', 'r:gz')
    for item in tar:
        tar.extract(item)
    print 'Done.'
except:
    name = os.path.basename(sys.argv[0])
    print name[:name.rfind('.')], '<filename>'
This should give you the desired result:
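import os, sys, tarfile
def extract(tar_url, extract_path='.'):
    print(tar_url)
    # Mode 'r' lets tarfile detect the compression, so .tgz and .tar both open.
    tar = tarfile.open(tar_url, 'r')
    for item in tar:
        tar.extract(item, extract_path)
        # Recurse into nested archives, using the path the member was just
        # extracted to (extract_path + member name), not the bare member name.
        if item.isfile() and item.name.endswith(('.tgz', '.tar')):
            nested = os.path.join(extract_path, item.name)
            extract(nested, os.path.dirname(nested))
    tar.close()
try:
    extract(sys.argv[1] + '.tgz')
    print('Done.')
except:
    # Any failure (including a missing argument) falls back to the usage line.
    name = os.path.basename(sys.argv[0])
    print('%s <filename>' % name[:name.rfind('.')])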
As @cularis said, this is called recursion: extract calls itself on each nested .tgz or .tar file it finds.
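Once everything is extracted, the search described in the question (certain strings in all .log and .txt files) could be done with a walk over the extraction directory. The following is only a rough sketch, written for Python 3; the function name search_logs and its parameters are made up for illustration and are not part of the code above. It assumes the logs are text files and that a plain substring match is enough.

```python
import os

def search_logs(root, needles, extensions=('.log', '.txt')):
    """Yield (path, line_number, line) for each line containing any needle.

    Hypothetical helper: walks every directory under root and scans only
    files whose names end with one of the given extensions.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            if not filename.endswith(extensions):
                continue
            path = os.path.join(dirpath, filename)
            # errors='replace' keeps the scan going if a log holds bytes
            # that cannot be decoded with the default encoding.
            with open(path, 'r', errors='replace') as handle:
                for line_number, line in enumerate(handle, start=1):
                    if any(needle in line for needle in needles):
                        yield path, line_number, line.rstrip('\n')
```

Calling search_logs('.', ['ERROR', 'timeout']) after the extraction would then yield one tuple per matching line; the two search strings here are placeholders. Because it is a generator, matches can be printed as they are found instead of being collected first.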