How do I extract several tar.gz files into a directory?

Question

I'm trying to extract a number of tar.gz files, with no success.

I've tried to modify some code I was using to extract zip files. Below are my file structure, file names, and the code.

File structure:

D:\Test\Tar

File names:

DZB1212-500258L004001_4.tgz
DZB1213-500119L002001_2.tgz
DZB1213-500119L006001_6.tgz

Code I've tried:

import glob
import os
import re
import tarfile
import gzip
import shutil
os.chdir('E:\\SPRING2019\\SILKROAD\\Folder_Extraction_Auto\\SRTM_DEMs\\TESTEXTRACTER3\\USGS_Declassified\\Declass2_2002')

#set up pathing
tarfile_rootdir = ('E:\\SPRING2019\\SILKROAD\\Folder_Extraction_Auto\\SRTM_DEMs\\TESTEXTRACTER3\\USGS_Declassified\\Declass2_2002')
extract_rootdir = ('E:\\SPRING2019\\SILKROAD\\Folder_Extraction_Auto\\TEST')

#process the zip files [a-zA-Z] to [\w] and removed the _ seperating the two WORKED!!!!!!!!!!!!
re_pattern = re.compile(r'\A([\w+]*)')
#CHANGED ABOVE CREATED HTO_O with no subfolers but all extracted
for tar_file in glob.iglob(os.path.join(tarfile_rootdir, '*.tar.gz')):
    part = re.findall(re_pattern, os.path.basename(tar_file))[0]
    part = [item.upper() for item in part]
    folder = {'outer': '{0}{1}{2}{3}'.format(*part), 'inner': '{0}{1}{2}{3}'.format(*part)}
    extract_path = os.path.join(extract_rootdir, folder['outer'])
    with tarfile.open(tar_file, 'r:gz') as tarfile:
        tar_file.extractall(extract_path)

It runs, but nothing happens.

Answer 1

import glob, os, re, tarfile

# Setup main paths.
tarfile_rootdir = r'D:\SPRING2019\Tarfiles'
extract_rootdir = r'D:\SPRING2019\Test'

# Process the files.
re_pattern = re.compile(r'\A(\w+)-\d+[a-zA-Z]0{0,5}(\d+)')

for tar_file in glob.iglob(os.path.join(tarfile_rootdir, '*.tgz')):

    # Get the parts from the base tgz filename using regular expressions.
    part = re.findall(re_pattern, os.path.basename(tar_file))[0]

    # Build the extraction path from each part.
    extract_path = os.path.join(extract_rootdir, *part)

    # Extract all files from the gzip-compressed tar archive.
    with tarfile.open(tar_file, 'r:gz') as r:
        r.extractall(extract_path)

This code is similar to the answer to your last question. Because the directory structure is not entirely clear from the question, I will use the following structure as an example.

TGZ files in D:\SPRING2019\Tarfiles:

DZB1216-500058L002001.tgz
DZB1216-500058L003001.tgz

Extract directory structure in D:\SPRING2019\Test:

DZB1216
    2001
    3001

The .tgz file paths are retrieved with glob.

For the example filename DZB1216-500058L002001.tgz, the regular expression captures 2 groups:

\A is an anchor at the start of the string. This is not a group.
(\w+) matches DZB1216. This is the 1st group.
-\d+[a-zA-Z]0{0,5} matches everything up to the next group. This is not a group.
(\d+) matches 2001. This is the 2nd group.
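
To check the pattern against real names, a short sketch like the one below may help. It applies the same regular expression to the example filename and to one of the filenames from the question. The helper name split_name is hypothetical and is not part of the code above.

```python
import re

# Same pattern as in the answer's code.
re_pattern = re.compile(r'\A(\w+)-\d+[a-zA-Z]0{0,5}(\d+)')


def split_name(filename):
    """Return the two captured groups, or None if the name does not match.

    Hypothetical helper, added only to illustrate the pattern.
    """
    match = re_pattern.match(filename)
    return match.groups() if match else None


print(split_name('DZB1216-500058L002001.tgz'))    # ('DZB1216', '2001')
print(split_name('DZB1212-500258L004001_4.tgz'))  # ('DZB1212', '4001')
print(split_name('notes.txt'))                    # None
```

Returning None for a non-matching name is a choice made for this sketch. The re.findall(...)[0] call in the code above would raise an IndexError in that case, so a check like this makes it easy to skip unexpected files.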

The extraction path is built by joining the values of extract_rootdir, DZB1216, and 2001. This gives D:\SPRING2019\Test\DZB1216\2001 as the extraction path.
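
As a small illustration of that join, the sketch below uses pathlib.PureWindowsPath rather than os.path.join, so that it prints the Windows-style path on any operating system. That substitution is an assumption made for this example only; the answer's code uses os.path.join.

```python
from pathlib import PureWindowsPath

extract_rootdir = r'D:\SPRING2019\Test'
part = ('DZB1216', '2001')  # the two groups captured by the regular expression

# Unpack the tuple so that each group becomes one path component.
extract_path = PureWindowsPath(extract_rootdir).joinpath(*part)
print(extract_path)  # D:\SPRING2019\Test\DZB1216\2001
```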

tarfile then extracts everything from the .tgz file.
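
To try the extraction step without touching real data, here is a minimal sketch that builds a tiny .tgz in a temporary directory and extracts it. The function name extract_tgz and the sample file hello.txt are made up for this example.

```python
import os
import tarfile
import tempfile


def extract_tgz(tar_path, extract_path):
    """Extract every member of a gzip-compressed tar archive."""
    os.makedirs(extract_path, exist_ok=True)
    with tarfile.open(tar_path, 'r:gz') as archive:
        archive.extractall(extract_path)


with tempfile.TemporaryDirectory() as workdir:
    # Build a small sample archive so the example is self-contained.
    sample = os.path.join(workdir, 'hello.txt')
    with open(sample, 'w') as f:
        f.write('hello')
    tar_path = os.path.join(workdir, 'DZB1216-500058L002001.tgz')
    with tarfile.open(tar_path, 'w:gz') as archive:
        archive.add(sample, arcname='hello.txt')

    # Extract into <workdir>/DZB1216/2001, mirroring the layout above.
    extract_path = os.path.join(workdir, 'DZB1216', '2001')
    extract_tgz(tar_path, extract_path)
    extracted = sorted(os.listdir(extract_path))
    print(extracted)  # ['hello.txt']
```

For archives from untrusted sources, recent Python versions also accept a filter argument to extractall, which may be worth a look in the tarfile documentation.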

Answer 2

It looks like your file names are *.tgz, but your glob is *.tar.gz!
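
A quick way to see the mismatch is the sketch below. It creates empty placeholder files named like the ones in the question and compares what each pattern finds. The helper find_archives is hypothetical. With only '*.tar.gz' the result is empty, so the loop body in the question never runs, which would explain why nothing happens.

```python
import glob
import os
import tempfile


def find_archives(root, patterns=('*.tgz', '*.tar.gz')):
    """Return sorted base names in root that match any of the patterns."""
    found = set()
    for pattern in patterns:
        for path in glob.iglob(os.path.join(root, pattern)):
            found.add(os.path.basename(path))
    return sorted(found)


with tempfile.TemporaryDirectory() as root:
    # Empty placeholder files, named like two of the files in the question.
    for name in ('DZB1212-500258L004001_4.tgz', 'DZB1213-500119L002001_2.tgz'):
        open(os.path.join(root, name), 'w').close()

    only_tar_gz = find_archives(root, patterns=('*.tar.gz',))
    both = find_archives(root)

print(only_tar_gz)  # []
print(both)         # ['DZB1212-500258L004001_4.tgz', 'DZB1213-500119L002001_2.tgz']
```

Once the pattern matches, the line `with tarfile.open(tar_file, 'r:gz') as tarfile:` in the question would also need a different name after `as`. As written it rebinds the tarfile module name, and the next line calls extractall on the path string tar_file rather than on the opened archive.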

Answer 1: 2 votes, accepted, 2019-06-14 02:00:58

Answer 2: 1 vote, 2019-06-12 17:17:05
