How to import a subset of a zip file into Colab?

I have a very large zip file in my Google Drive that contains several subfolders. I'd like to extract only a few of those subfolders into Colab, not all of them. Is there any way to do this?

For instance, suppose the zip file is named "MyBigFile.zip" and contains "folder1", "folder2", "folder3", "folder4", and "folder5". I only want to import and extract "folder1" and "folder4" into my Google Colab (and, ideally, import only 200 images). How can I do this? Any suggestions?

*If this is relevant: each of folders 1-5 contains around 50,000 .png files.

After some searching I found something. You can use the zipfile module in Google Colab too.

from zipfile import ZipFile
from google.colab import drive

drive.mount('/content/drive')

def extract(archive, folderName, numberOfFiles):
    # Keep only files under folderName; directory entries end with "/" and are skipped
    files = [name for name in archive.namelist()
             if name.startswith(folderName) and not name.endswith('/')][:numberOfFiles]
    for file in files:
        archive.extract(file, 'Output Folder Path')  # extractedFolder

# The with block closes the archive even if extraction fails
with ZipFile("Zip File Path") as archive:  # MyBigFile.zip
    extract(archive, "folder1/", 200)

You can drop the google.colab code if you mount Drive manually by clicking the button shown in the screenshot.

[Image: screenshot of the button for mounting Drive in Colab]

After that, you can remove the two google.colab lines (commented out here):

from zipfile import ZipFile
# from google.colab import drive

# drive.mount('/content/drive')

def extract(archive, folderName, numberOfFiles):
    # Keep only files under folderName; directory entries end with "/" and are skipped
    files = [name for name in archive.namelist()
             if name.startswith(folderName) and not name.endswith('/')][:numberOfFiles]
    for file in files:
        archive.extract(file, 'extractedFolder')

# The with block closes the archive even if extraction fails
with ZipFile("MyBigFile.zip") as archive:
    extract(archive, "folder1/", 200)
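
The question asks for two folders and .png images only, so the approach above could be extended along these lines. This is only a sketch: the function name extract_subset, its parameters, and the choice to apply the limit per folder are assumptions for illustration, not part of the original answer.

```python
from zipfile import ZipFile


def extract_subset(zip_path, folders, limit, out_dir, suffix=".png"):
    """Extract up to `limit` files ending in `suffix` from each folder in `folders`.

    Hypothetical helper: the name and parameters are illustrative only.
    Returns the list of archive member names that were extracted.
    """
    extracted = []
    with ZipFile(zip_path) as archive:
        names = archive.namelist()
        for folder in folders:
            # Normalise "folder1" and "folder1/" to the same prefix.
            prefix = folder.rstrip("/") + "/"
            # Keep real files only: directory entries end with "/".
            wanted = [
                name for name in names
                if name.startswith(prefix)
                and not name.endswith("/")
                and name.lower().endswith(suffix)
            ][:limit]
            # extractall with an explicit member list writes only those entries.
            archive.extractall(out_dir, members=wanted)
            extracted.extend(wanted)
    return extracted
```

With the names from the question, a call might look like extract_subset("MyBigFile.zip", ["folder1", "folder4"], 200, "extractedFolder"). Only the selected members are written to disk, and the other folders are never unpacked.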

You need to mount your Google Drive through Colab first:

from google.colab import drive
drive.mount('/content/drive')

Now unzip only the specific folders you want, into the location you want:

!unzip /path_to/MyBigFile.zip 'folder1/*' -d /path_to_unzip
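
If you would rather stay in Python than shell out, a similar wildcard-style selection can be sketched with the standard fnmatch and zipfile modules. The helper name extract_matching and its parameters are assumptions for illustration. Note that fnmatch's `*` also matches "/" characters, so 'folder1/*' covers nested paths under folder1 as well.

```python
from fnmatch import fnmatchcase
from zipfile import ZipFile


def extract_matching(zip_path, patterns, out_dir):
    """Extract every archive member whose name matches any of `patterns`.

    Hypothetical helper: the name and parameters are illustrative only.
    Returns the list of archive member names that were extracted.
    """
    with ZipFile(zip_path) as archive:
        members = [
            name for name in archive.namelist()
            # fnmatchcase compares case-sensitively on every platform.
            if any(fnmatchcase(name, pattern) for pattern in patterns)
        ]
        archive.extractall(out_dir, members=members)
    return members
```

For the archive in the question, extract_matching("/path_to/MyBigFile.zip", ["folder1/*", "folder4/*"], "/path_to_unzip") would select both folders in one pass.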
