I have a zip archive that contains a subfolder with more than 60,000 images. Is there a way to read all the images in that subfolder without extracting the archive? (The images total roughly 100 GB.)
I was thinking of using Python's zipfile package. However, I can't see how to use the module's open function here, since I don't know how to iterate over the whole subfolder. Any pointers on how to do this would be appreciated.
```python
import zipfile

with zipfile.ZipFile("/home/diliptmonson/abc.zip", "r") as zip_ref:
    train_images = zip_ref.open('train/86760c00-21bc-11ea-a13a-137349068a90.jpg')
```
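You may use the following solution:

- Get the list of file names in the zip archive.
- Read only the entries with a `.jpg` extension.
- Decode each entry's bytes into an image with `cv2.imdecode`.

Here is the code: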
```python
from zipfile import ZipFile
import numpy as np
import cv2
import os

# https://thispointer.com/python-how-to-get-the-list-of-all-files-in-a-zip-archive/
with ZipFile("abc.zip", "r") as zip_ref:
    # Get the list of file names in the zip
    list_of_files = zip_ref.namelist()

    # Iterate over the file names and process only the JPEG images
    for elem in list_of_files:
        ext = os.path.splitext(elem)[-1]  # Get extension of elem

        if ext.lower() == ".jpg":  # case-insensitive match
            # Read only this entry's bytes into memory
            in_bytes = zip_ref.read(elem)

            # Decode bytes to image (np.frombuffer replaces the deprecated np.fromstring)
            img = cv2.imdecode(np.frombuffer(in_bytes, np.uint8), cv2.IMREAD_COLOR)

            # Show image for testing
            cv2.imshow('img', img)
            cv2.waitKey(1000)

cv2.destroyAllWindows()
```
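Since the question is about one subfolder in a very large archive, the same idea can be wrapped in a generator that yields one image at a time, so only a single entry is held in memory. This is a minimal sketch using only the standard library; the function name `iter_zip_images`, its parameters, and the `train/` default (taken from the path in the question) are assumptions for illustration, not part of `zipfile`.

```python
import io
import zipfile


def iter_zip_images(zip_source, subfolder="train/", extensions=(".jpg", ".jpeg")):
    """Yield (member_name, raw_bytes) for each image stored under `subfolder`.

    `zip_source` may be a path or a binary file-like object.
    `iter_zip_images` is a hypothetical helper, not a zipfile API.
    """
    with zipfile.ZipFile(zip_source, "r") as zip_ref:
        for info in zip_ref.infolist():
            if info.is_dir():
                continue  # skip directory entries such as "train/"
            name = info.filename
            if not name.startswith(subfolder):
                continue  # skip members outside the requested subfolder
            if not name.lower().endswith(extensions):
                continue  # skip members that are not images
            # Only this one member is decompressed into memory.
            yield name, zip_ref.read(info)


# Small in-memory archive standing in for abc.zip.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("train/a.jpg", b"fake-jpeg-1")
    zf.writestr("train/readme.txt", b"not an image")
    zf.writestr("test/b.jpg", b"fake-jpeg-2")

for name, data in iter_zip_images(demo, subfolder="train/"):
    print(name, len(data))
```

Each `raw_bytes` value can then be passed to `cv2.imdecode` via `np.frombuffer`, exactly as in the loop above.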
Use a for-loop:
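```python
from zipfile import ZipFile

with ZipFile("abc.zip", "r") as zip_ref:
    # namelist() lists every entry in the archive, including those in subfolders
    for file in zip_ref.namelist():
        with zip_ref.open(file) as opened_file:
            data = opened_file.read()  # do stuff with your file
```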
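Note that the name list also includes directory entries and files from other folders. With an archive of this size it can help to check what a subfolder contains before reading any image data. The sketch below does that from the archive's metadata alone, using `infolist()`, `ZipInfo.is_dir()` and `ZipInfo.file_size`; the helper name `summarize_folder` and the `train/` prefix are assumptions for illustration.

```python
import io
import zipfile


def summarize_folder(zip_source, prefix="train/"):
    """Return (file_count, total_uncompressed_bytes) for members under `prefix`.

    `summarize_folder` is a hypothetical helper; it reads metadata only.
    """
    count = 0
    total_bytes = 0
    with zipfile.ZipFile(zip_source, "r") as zip_ref:
        for info in zip_ref.infolist():
            # Skip directory entries and anything outside the subfolder.
            if info.is_dir() or not info.filename.startswith(prefix):
                continue
            count += 1
            total_bytes += info.file_size  # uncompressed size from the archive metadata
    return count, total_bytes


# Small in-memory archive standing in for abc.zip.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("train/a.jpg", b"12345")
    zf.writestr("other/b.jpg", b"123")

print(summarize_folder(demo))  # (1, 5)
```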