简体   繁体   English

如何从zip文件复制到文件夹而不解压缩?

[英]How to copy from zip file to a folder without unzipping it?

How to make this code works?如何使此代码有效? There is a zip file with folders and .png files in it.有一个包含文件夹和 .png 文件的 zip 文件。 Folder ".\\icons_by_year" is empty.文件夹“.\\icons_by_year”是空的。 I need to get every file one by one without unzipping it and copy to the root of the selected folder (so no extra folders made).我需要一个一个地获取每个文件而不解压缩它并复制到所选文件夹的根目录(因此没有额外的文件夹)。

class ArrangerOutZip(Arranger):
    def __init__(self):
        self.base_source_folder = '\\icons.zip'
        self.base_output_folder = ".\\icons_by_year"

    def proceed(self):
        self.create_and_copy()

    def create_and_copy(self):
        reg_pattern = re.compile('.+\.\w{1,4}$')
        f = open(self.base_source_folder, 'rb')
        zfile = zipfile.ZipFile(f)
        for cont in zfile.namelist():
            if reg_pattern.match(cont):
                with zfile.open(cont) as file:
                    shutil.copyfileobj(file, self.base_output_folder)
        zfile.close()
        f.close()


arranger = ArrangerOutZip()
arranger.proceed()

I'd take a look at the zipfile libraries' getinfo() and also ZipFile.Path() for construction since the constructor class can also use paths that way if you intend to do any creation.我会查看zipfile库的 getinfo() 和 ZipFile.Path() 以进行构造,因为如果您打算进行任何创建,构造函数类也可以以这种方式使用路径。

Specifically PathObjects .特别是PathObjects This is able to do is to construct an object with a path in it, and it appears to be based on pathlib.这能做的就是构造一个包含路径的对象,而且它似乎是基于pathlib的。 Assuming you don't need to create zipfiles, you can ignore this ZipFile.Path()假设你不需要创建 zipfiles,你可以忽略这个 ZipFile.Path()

However, that's not exactly what I wanted to point out.然而,这并不是我想要指出的。 Rather consider the following:而是考虑以下几点:

zipfile.getinfo() zipfile.getinfo()

There is a person who I think is getting at this exact situation here:我认为有人在这里遇到了这种情况:

https://www.programcreek.com/python/example/104991/zipfile.getinfo https://www.programcreek.com/python/example/104991/zipfile.getinfo

This person seems to be getting a path using getinfo().此人似乎正在使用 getinfo() 获取路径。 It's also clear that NOT every zipfile has the info.也很明显,并非每个 zipfile 都有信息。

shutil.copyfileobj uses file objects for source and destination files. shutil.copyfileobj使用文件对象作为源文件和目标文件。 To open the destination you need to construct a file path for it.要打开目标,您需要为其构建文件路径。 pathlib is a part of the standard python library and is a nice way to handle file paths. pathlib是标准 python 库的一部分,是处理文件路径的好方法。 And ZipFile.extract does some of the work of creating intermediate output directories for you (plus sets file metadata) and can be used instead of copyfileobj . ZipFile.extract会为您完成一些创建中间输出目录的工作(加上设置文件元数据),并且可以代替copyfileobj

One risk of unzipping files is that they can contain absolute or relative paths outside of the target directory you intend (eg, "../../badvirus.exe").解压缩文件的风险之一是它们可能包含您想要的目标目录之外的绝对或相对路径(例如,“../../badvirus.exe”)。 extract is a bit too lax about that - putting those files in the root of the target directory - so I wrote a little something to reject the whole zip if you are being messed with. extract对此有点太松散了 - 将这些文件放在目标目录的根目录中 - 所以我写了一些东西来拒绝整个 zip,如果你被弄乱了。

With a few tweeks to make this a testable program,用几个星期的时间使它成为一个可测试的程序,

from pathlib import Path
import re
import zipfile
#import shutil

#class ArrangerOutZip(Arranger):
class ArrangerOutZip:
    def __init__(self, base_source_folder, base_output_folder):
        self.base_source_folder = Path(base_source_folder).resolve(strict=True)
        self.base_output_folder = Path(base_output_folder).resolve()

    def proceed(self):
        self.create_and_copy()

    def create_and_copy(self):
        """Unzip files matching pattern to base_output_folder, raising
        ValueError if any resulting paths are outside of that folder.
        Output folder created if it does not exist."""
        reg_pattern = re.compile('.+\.\w{1,4}$')
        with open(self.base_source_folder, 'rb') as f:
            with zipfile.ZipFile(f) as zfile:
                wanted_files = [cont for cont in zfile.namelist()
                    if reg_pattern.match(cont)]
                rebased_files = self._rebase_paths(wanted_files,
                    self.base_output_folder)
                for cont, rebased in zip(wanted_files, rebased_files):
                    print(cont, rebased, rebased.parent)
                    # option 1: use shutil
                    #rebased.parent.mkdir(parents=True, exist_ok=True)
                    #with zfile.open(cont) as file, open(rebased, 'wb') as outfile:
                    #    shutil.copyfileobj(file, outfile)
                    # option 2: zipfile does the work for you
                    zfile.extract(cont, self.base_output_folder)
                    
    @staticmethod
    def _rebase_paths(pathlist, target_dir):
        """Rebase relative file paths to target directory, raising
        ValueError if any resulting paths are not within target_dir"""
        target = Path(target_dir).resolve()
        newpaths = []
        for path in pathlist:
            newpath = target.joinpath(path).resolve()
            newpath.relative_to(target) # raises ValueError if not subpath
            newpaths.append(newpath)
        return newpaths


#arranger = ArrangerOutZip('\\icons.zip', '.\\icons_by_year')

import sys
try:
    arranger = ArrangerOutZip(sys.argv[1], sys.argv[2])
    arranger.proceed()
except IndexError:
    print("usage: test.py zipfile targetdir")

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM