
Extract files inside zip subfolders with Python zipfile

I have a zip folder that contains files and child zip folders. I am able to read the files placed in the parent folder, but how can I get to the files inside the child zip folders? Here is my code to read the files inside the parent folder:

from io import BytesIO
import pandas as pd
import requests
import zipfile
url1 = 'https://www.someurl.com/abc.zip'
r = requests.get(url1)
z = zipfile.ZipFile(BytesIO(r.content))
temp = pd.read_csv(z.open('mno.csv'))

My question is: if, let's say, I have a child zip folder

xyz.zip

containing the file

pqr.csv

how can I read this file?

Use another BytesIO object to open the contained zip file:

from io import BytesIO
import pandas as pd
import requests
import zipfile

# Read outer zip file
url1 = 'https://www.someurl.com/abc.zip'
r = requests.get(url1)
z = zipfile.ZipFile(BytesIO(r.content))

# Let's say the archive contains:
#     zippped_folder/pqr.zip (which contains pqr.csv)

# Read contained zip file.
# BytesIO needs raw bytes, so use z.read() rather than z.open().
pqr_zip = zipfile.ZipFile(BytesIO(z.read('zippped_folder/pqr.zip')))
temp = pd.read_csv(pqr_zip.open('pqr.csv'))
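The same idea can be tried without a network call. The following is only a minimal sketch: it builds a small nested archive in memory and reads the inner CSV back. The helper names build_nested_zip and read_nested_csv and the sample file contents are made up for illustration, and the standard csv module stands in for pandas so the example has no third-party dependencies.

```python
import csv
import io
import zipfile


def build_nested_zip() -> bytes:
    """Build an in-memory outer zip holding mno.csv and a child archive xyz.zip."""
    inner_buf = io.BytesIO()
    with zipfile.ZipFile(inner_buf, "w") as inner:
        inner.writestr("pqr.csv", "a,b\n1,2\n")
    # The inner archive is complete only after its ZipFile has been closed.
    outer_buf = io.BytesIO()
    with zipfile.ZipFile(outer_buf, "w") as outer:
        outer.writestr("mno.csv", "x,y\n3,4\n")
        outer.writestr("xyz.zip", inner_buf.getvalue())
    return outer_buf.getvalue()


def read_nested_csv(outer_bytes: bytes, inner_zip_name: str, csv_name: str) -> list:
    """Return the rows of csv_name stored in inner_zip_name inside the outer zip."""
    with zipfile.ZipFile(io.BytesIO(outer_bytes)) as outer:
        # read() returns the child archive as bytes, ready to wrap in BytesIO.
        inner_bytes = outer.read(inner_zip_name)
    with zipfile.ZipFile(io.BytesIO(inner_bytes)) as inner:
        with inner.open(csv_name) as fh:
            text = io.TextIOWrapper(fh, encoding="utf-8", newline="")
            return list(csv.reader(text))


rows = read_nested_csv(build_nested_zip(), "xyz.zip", "pqr.csv")
print(rows)
```

With pandas installed, the file object returned by inner.open(csv_name) could be passed to pd.read_csv instead of csv.reader.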

After trying a few combinations, I solved the problem with this code:

# i is the index position of the child zip folder ('xyz.zip' in this case) in the namelist() list.
# Read the child's bytes from the parent archive and wrap them in BytesIO;
# passing the bare name to ZipFile would look for a file on disk instead.
zz = zipfile.ZipFile(BytesIO(z.read(z.namelist()[i])))
temp2 = pd.read_csv(zz.open('pqr.csv'))

# For example, if 'xyz.zip' were third in the list, the command would be:
zz = zipfile.ZipFile(BytesIO(z.read(z.namelist()[2])))

Alternatively, if the index position is not known, the same can be achieved by looking the child zip folder up by name:

zz = zipfile.ZipFile(BytesIO(z.read('xyz.zip')))
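When neither the index nor the name of the child zip folder is known in advance, one option is to scan namelist() for entries ending in .zip. This is only a rough sketch under that assumption: make_zip and list_nested_members are hypothetical helper names, and the archive contents are sample data built in memory.

```python
import io
import zipfile


def make_zip(members: dict) -> bytes:
    """Create an in-memory zip from a {member name: str or bytes} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in members.items():
            zf.writestr(name, data)
    return buf.getvalue()


def list_nested_members(outer_bytes: bytes) -> dict:
    """Map each child .zip member of the outer archive to the names it contains."""
    found = {}
    with zipfile.ZipFile(io.BytesIO(outer_bytes)) as outer:
        for name in outer.namelist():
            if name.lower().endswith(".zip"):
                # Open the child archive from its bytes, not from disk.
                with zipfile.ZipFile(io.BytesIO(outer.read(name))) as child:
                    found[name] = child.namelist()
    return found


outer_bytes = make_zip({
    "mno.csv": "x,y\n3,4\n",
    "xyz.zip": make_zip({"pqr.csv": "a,b\n1,2\n"}),
})
nested = list_nested_members(outer_bytes)
print(nested)
```

Matching on the file extension is a heuristic; zipfile.is_zipfile on the wrapped bytes could serve as a stricter check.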


 