
Extract files from zip without keeping the structure using python ZipFile?

I am trying to extract all files from a .zip that contains subfolders into a single folder. I want the files from every subfolder extracted into that one folder, without keeping the original structure. At the moment I extract everything, move the files into one folder, then remove the leftover subfolders. Files with the same name are overwritten.

Is it possible to do this before the files are written?

Here is an example structure:

my_zip/file1.txt
my_zip/dir1/file2.txt
my_zip/dir1/dir2/file3.txt
my_zip/dir3/file4.txt

In the end I want this:

my_dir/file1.txt
my_dir/file2.txt
my_dir/file3.txt
my_dir/file4.txt

What can I add to this code?

import zipfile
my_dir = "D:\\Download\\"
my_zip = "D:\\Download\\my_file.zip"

zip_file = zipfile.ZipFile(my_zip, 'r')
for files in zip_file.namelist():
    zip_file.extract(files, my_dir)
zip_file.close()

If I rename the file paths returned by zip_file.namelist(), I get this error:

KeyError: "There is no item named 'file2.txt' in the archive"

This opens a file handle for each member of the zip archive, takes the base filename from the member's path, and copies the contents to a target file. It is how ZipFile.extract works internally, minus the handling of subdirectories.

import os
import shutil
import zipfile

my_dir = r"D:\Download"
my_zip = r"D:\Download\my_file.zip"

with zipfile.ZipFile(my_zip) as zip_file:
    for member in zip_file.namelist():
        filename = os.path.basename(member)
        # skip directories
        if not filename:
            continue
    
        # copy file (taken from zipfile's extract)
        source = zip_file.open(member)
        target = open(os.path.join(my_dir, filename), "wb")
        with source, target:
            shutil.copyfileobj(source, target)
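
The question mentions that files with the same name get overwritten once the folders are dropped. As a minimal sketch, assuming you would rather keep both copies, the loop above can be wrapped in a function that appends a numeric suffix on a clash. The name `extract_flat` and the `_1`, `_2` suffix scheme are illustrative choices, not part of the original answer.

```python
import os
import shutil
import zipfile


def extract_flat(zip_path, dest_dir):
    """Extract every file in zip_path directly into dest_dir, ignoring folders.

    Hypothetical helper: a name clash gets a numeric suffix (_1, _2, ...)
    instead of overwriting the file that is already there.
    """
    os.makedirs(dest_dir, exist_ok=True)
    written = []
    with zipfile.ZipFile(zip_path) as zip_file:
        for member in zip_file.infolist():
            if member.is_dir():
                continue  # directory entries carry no data
            filename = os.path.basename(member.filename)
            if not filename:
                continue
            stem, ext = os.path.splitext(filename)
            target_path = os.path.join(dest_dir, filename)
            counter = 1
            # Pick the first free name: file.txt, file_1.txt, file_2.txt, ...
            while os.path.exists(target_path):
                target_path = os.path.join(dest_dir, f"{stem}_{counter}{ext}")
                counter += 1
            with zip_file.open(member) as source, open(target_path, "wb") as target:
                shutil.copyfileobj(source, target)
            written.append(target_path)
    return written
```

Calling `extract_flat(my_zip, my_dir)` returns the list of paths it wrote, which makes it easy to see which files were renamed.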

It is possible to iterate over ZipFile.infolist(). On each returned ZipInfo object you can change the filename attribute to strip the directory part, then extract the member to a specified directory.

import zipfile
import os

my_dir = "D:\\Download\\"
my_zip = "D:\\Download\\my_file.zip"

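with zipfile.ZipFile(my_zip) as zip_file:
    for zip_info in zip_file.infolist():
        # skip directory entries (their names end with '/')
        if zip_info.is_dir():
            continue
        # only the in-memory ZipInfo is renamed; the archive itself is untouched
        zip_info.filename = os.path.basename(zip_info.filename)
        zip_file.extract(zip_info, my_dir)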

Just extract to bytes in memory, compute the filename, and write the file yourself instead of letting the library do it. In practice this means using the read() method instead of extract():

Python 3.6+ update (2020): the same code as the original answer, but using pathlib.Path, which eases file-path manipulation and other operations (such as write_bytes).

from pathlib import Path
import zipfile

my_dir = Path("D:\\Download\\")
my_zip = my_dir / "my_file.zip"

with zipfile.ZipFile(my_zip, 'r') as zip_file:
    for name in zip_file.namelist():
        if name.endswith("/"):
            continue  # skip directory entries
        # read() takes the member name; its optional second argument is a password
        data = zip_file.read(name)
        myfile_path = my_dir / Path(name).name
        myfile_path.write_bytes(data)

Original code from the answer, without pathlib:

import zipfile
import os

my_dir = "D:\\Download\\"
my_zip = "D:\\Download\\my_file.zip"

with zipfile.ZipFile(my_zip, 'r') as zip_file:
    for name in zip_file.namelist():
        # member names in a zip archive use "/" as the separator regardless of OS
        filename = name.split("/")[-1]
        if not filename:
            continue  # directory entry, nothing to write
        # read() takes the member name; its optional second argument is a password
        data = zip_file.read(name)
        with open(os.path.join(my_dir, filename), "wb") as myfile:
            myfile.write(data)
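
Splitting on "/" as above relies on zip member names using forward slashes on every platform. The ZIP specification requires this and Python's zipfile module writes names that way, though an archive produced by a non-conforming tool could still contain backslashes. A small in-memory check, using made-up member names taken from the question's example, shows `pathlib.PurePosixPath` doing the same split independently of the host OS:

```python
import io
import zipfile
from pathlib import PurePosixPath

# Build a small archive in memory so the example needs no files on disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zip_file:
    zip_file.writestr("my_zip/dir1/file2.txt", b"two")
    zip_file.writestr("my_zip/dir3/file4.txt", b"four")

# PurePosixPath always treats "/" as the separator, on Windows too.
with zipfile.ZipFile(buffer) as zip_file:
    flat_names = [PurePosixPath(name).name for name in zip_file.namelist()]

print(flat_names)  # ['file2.txt', 'file4.txt']
```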

A similar concept to Gerhard Götz's solution, but adapted to extract a single file instead of the entire zip:

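import os
from zipfile import ZipFile

# zipPath, path_in_zip and destination are supplied by the caller
with ZipFile(zipPath, 'r') as zipObj:
    zipInfo = zipObj.getinfo(path_in_zip)
    zipInfo.filename = os.path.basename(destination)
    zipObj.extract(zipInfo, os.path.dirname(os.path.realpath(destination)))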
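
If you would rather not modify the ZipInfo object, the same single-file extraction can be sketched by streaming the member to the destination yourself. The helper name `extract_member_as` is hypothetical; it only combines `ZipFile.open` with `shutil.copyfileobj`.

```python
import os
import shutil
import zipfile


def extract_member_as(zip_path, path_in_zip, destination):
    """Copy one archive member to destination (hypothetical helper)."""
    # Make sure the folder that will hold the file exists.
    os.makedirs(os.path.dirname(os.path.abspath(destination)), exist_ok=True)
    with zipfile.ZipFile(zip_path) as zip_obj:
        # open() looks the member up by its full name inside the archive.
        with zip_obj.open(path_in_zip) as source, open(destination, "wb") as target:
            shutil.copyfileobj(source, target)
    return destination
```

The member is still looked up by its full path inside the archive, so a name that does not exist raises the same KeyError the question shows.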

If you get a BadZipFile error, you can unpack the archive with 7-Zip in a subprocess instead. Assuming 7-Zip is installed, use the following code.

import subprocess
my_dir = destFolder  # destination folder (destFolder is defined elsewhere)
my_zip = destFolder + "/" + "filename.zip"  # file you want to extract
ziploc = "C:/Program Files/7-Zip/7z.exe"  # location where 7zip is installed
# 'e' extracts without directory names; '*.txt' with '-r' picks
# only txt files from all subdirectories
cmd = [ziploc, 'e', my_zip, '-o' + my_dir, '*.txt', '-r']
sp = subprocess.Popen(cmd, stderr=subprocess.STDOUT, stdout=subprocess.PIPE)
output, _ = sp.communicate()  # wait for 7zip to finish and collect its output
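
The Popen call above does not by itself report whether 7-Zip succeeded. A rough sketch, assuming you want a failure surfaced as an exception, is to use subprocess.run and check the return code. The function names are hypothetical, and the added `-y` switch (answer yes to all prompts, so that overwrite questions do not block the process) is my addition rather than part of the original answer.

```python
import subprocess


def build_7z_flat_command(seven_zip_exe, archive, dest_dir, pattern="*"):
    """Return the 7-Zip command line for a flat extraction (hypothetical helper)."""
    # 'e' extracts without directory names, '-o<dir>' sets the output folder,
    # '-r' recurses into subdirectories, '-y' answers yes to every prompt.
    return [seven_zip_exe, "e", archive, "-o" + dest_dir, pattern, "-r", "-y"]


def extract_flat_with_7z(seven_zip_exe, archive, dest_dir, pattern="*"):
    """Run 7-Zip and raise if it exits with a non-zero status."""
    cmd = build_7z_flat_command(seven_zip_exe, archive, dest_dir, pattern)
    result = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    if result.returncode != 0:
        raise RuntimeError(f"7-Zip exited with {result.returncode}:\n{result.stdout}")
    return result.stdout
```

With the variables from the answer, this would be called as `extract_flat_with_7z(ziploc, my_zip, my_dir, "*.txt")`.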
