
How to read files in a folder within a zipped folder in Python

I have a zipped folder that contains a subfolder, and the subfolder holds more than 60,000 images. Is there a way to read all the images in the subfolder without extracting the archive? The image folder is about 100 GB in size.

I was thinking of using the zipfile package in Python. However, I can't see how to use the module's open function here, since I don't know how to iterate through the whole subfolder. Any input on how to do this would be appreciated.

import zipfile

with zipfile.ZipFile("/home/diliptmonson/abc.zip", "r") as zip_ref:
    train_images = zip_ref.open('train/86760c00-21bc-11ea-a13a-137349068a90.jpg')

You may use the following solution:

  • Open the zip file and iterate over its contents.
  • Verify that the file extension is .jpg.
  • Read the binary image data of a specific element (a file within the folder) from the zip.
  • Decode the binary data into an image using cv2.imdecode.

Here is the code:

from zipfile import ZipFile
import numpy as np
import cv2
import os

# https://thispointer.com/python-how-to-get-the-list-of-all-files-in-a-zip-archive/
with ZipFile("abc.zip", "r") as zip_ref:
   # Get list of files names in zip
   list_of_files = zip_ref.namelist()

   # Iterate over the file names in the archive
   for elem in list_of_files:
       #print(elem)
       ext = os.path.splitext(elem)[-1]  # Get extension of elem

       if ext == ".jpg":
           # Read data in case extension is ".jpg"
           in_bytes = zip_ref.read(elem)

           # Decode bytes to image (np.frombuffer replaces the deprecated np.fromstring).
           img = cv2.imdecode(np.frombuffer(in_bytes, np.uint8), cv2.IMREAD_COLOR)

           # Show image for testing
           cv2.imshow('img', img)
           cv2.waitKey(1000)

cv2.destroyAllWindows()
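
Since the question is about a single subfolder, the same idea can be narrowed to the members whose path starts with that folder's name. Below is a minimal sketch using only the standard library. The function name iter_subfolder_images, the default subfolder "train" (taken from the path in the question), and the extension list are assumptions made for illustration and are not part of the answer above.

```python
import posixpath
from zipfile import ZipFile


def iter_subfolder_images(zip_path, subfolder="train", extensions=(".jpg", ".jpeg")):
    """Yield (member_name, raw_bytes) for image files under one subfolder.

    Hypothetical helper: the name and default arguments are illustrative.
    """
    # Member names inside a zip always use forward slashes
    prefix = subfolder.strip("/") + "/"
    with ZipFile(zip_path, "r") as zip_ref:
        for info in zip_ref.infolist():
            # Skip directory entries and anything outside the subfolder
            if info.is_dir() or not info.filename.startswith(prefix):
                continue
            ext = posixpath.splitext(info.filename)[1].lower()
            if ext in extensions:
                # Only this one member is decompressed into memory
                yield info.filename, zip_ref.read(info)
```

Because this is a generator, only one image's bytes are held at a time, which matters for an archive of this size. Each yielded bytes object can be passed to cv2.imdecode exactly as in the code above.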

Use a for loop:

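from zipfile import ZipFile

with ZipFile("abc.zip", "r") as zip_ref:
    # namelist() returns the name of every member in the archive
    for file in zip_ref.namelist():
        with zip_ref.open(file) as opened_file:
            pass  # do stuff with your file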
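
As one possible example of "doing stuff" with the opened file: open() returns a binary file-like object, so it can be read partially instead of all at once. The sketch below reads only the first two bytes of each member and compares them with the JPEG signature (FF D8). The function name find_jpeg_members is hypothetical and chosen only for this illustration.

```python
from zipfile import ZipFile

JPEG_MAGIC = b"\xff\xd8"  # JPEG files begin with the SOI marker FF D8


def find_jpeg_members(zip_path):
    """Return names of members whose first two bytes match the JPEG signature.

    Hypothetical helper, not part of the zipfile module.
    """
    matches = []
    with ZipFile(zip_path, "r") as zip_ref:
        for info in zip_ref.infolist():
            if info.is_dir():
                continue  # directory entries have no content
            # Read just the header rather than the whole member
            with zip_ref.open(info) as opened_file:
                if opened_file.read(2) == JPEG_MAGIC:
                    matches.append(info.filename)
    return matches
```

Unlike the extension check, this identifies JPEG data even when a file is misnamed, at the cost of opening every member.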
