Python: Cross-platform code to download a valid .zip file

I have a requirement to download and unzip a file from a website. Here is the code I'm using:

    #!/usr/bin/python

    #geoipFolder = r'/my/folder/path/ '     #Mac/Linux folder path
    geoipFolder = r'D:\my\folder\path\ '    #Windows folder path
    geoipFolder = geoipFolder[:-1]          #workaround for Windows escaping trailing quote
    geoipName   = 'GeoIPCountryWhois'
    geoipURL    = 'http://geolite.maxmind.com/download/geoip/database/GeoIPCountryCSV.zip'

    import urllib2
    response = urllib2.urlopen(geoipURL)

    f = open('%s.zip' % (geoipFolder+geoipName),"w")
    f.write(repr(response.read()))
    f.close()

    import zipfile  
    zip = zipfile.ZipFile(r'%s.zip' % (geoipFolder+geoipName))
    zip.extractall(r'%s' % geoipFolder)

This code works on Mac and Linux boxes, but not on Windows. There, the .zip file is written, but the script throws this error:

    zipfile.BadZipfile: File is not a zip file

I can't unzip the file using Windows Explorer either. It says that:

    The compressed (zipped) folder is empty.

However, the file on disk is 6 MB in size.

Thoughts on what I'm doing wrong on Windows?

Thanks

Your zip file is corrupt on Windows because you're opening the file in write/text mode, and on Windows the line-terminator conversion done in text mode trashes binary data:

    f = open('%s.zip' % (geoipFolder+geoipName),"w")

You have to open in write/binary mode like this:

    f = open('%s.zip' % (geoipFolder+geoipName),"wb")

(This still works on Linux and Mac, of course: in Python 2 on those systems, "w" and "wb" write the same bytes.)
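The difference between the two modes can be illustrated with a small, self-contained sketch. It is written for Python 3 and builds a tiny archive in memory as a stand-in for the real download; the helper `make_sample_zip` and the member name `sample.csv` are made up for the illustration. It writes the bytes in binary mode, reads them back unchanged, and then mimics what Windows text mode does in Python 2 by replacing every `\n` byte with `\r\n`.

```python
import io
import os
import tempfile
import zipfile


def make_sample_zip():
    """Build a small zip archive in memory (stand-in for the downloaded file)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        # Stored (uncompressed) so the newline bytes appear verbatim in the archive.
        zf.writestr("sample.csv", "a,b\n1,2\n")
    return buf.getvalue()


data = make_sample_zip()

# Binary mode: the bytes on disk are exactly the bytes we wrote, on every platform.
path = os.path.join(tempfile.mkdtemp(), "sample.zip")
with open(path, "wb") as f:
    f.write(data)
with open(path, "rb") as f:
    round_trip = f.read()

# Simulation of Windows text mode in Python 2: every "\n" byte becomes "\r\n".
translated = data.replace(b"\n", b"\r\n")
extra_bytes = len(translated) - len(data)
```

The extra bytes shift everything that follows them, so the offsets recorded inside the archive no longer point at the right places. That is consistent with the `BadZipfile` error in the question.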

To sum up, a more Pythonic way to do it is to use a with block and to remove repr, which writes a quoted, escaped representation of the data rather than the raw bytes:

    with open('{}{}.zip'.format(geoipFolder, geoipName), "wb") as f:
        f.write(response.read())
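Why `repr` has to go can be seen on the four bytes that open a zip local file header. This is a minimal Python 3 illustration; in Python 2 the result is the same apart from the missing `b` prefix.

```python
data = b"PK\x03\x04"   # signature bytes at the start of a zip local file header

# repr() returns printable text with quotes and escape sequences,
# not the original bytes: the 4 input bytes become 13 characters.
as_text = repr(data)
```

Writing `as_text` to a file stores the characters of that representation, so the result is longer than the original data and no longer starts with the zip signature.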

EDIT: there is no need to write the file to disk at all. You can use io.BytesIO, since the ZipFile object accepts a file-like object as its first parameter.

    import io
    import zipfile

    # wrap the downloaded bytes in an in-memory file-like object
    outbuf = io.BytesIO(response.read())

    zip = zipfile.ZipFile(outbuf)  # pass the fake-file handle: no disk write, no temp file
    zip.extractall(geoipFolder)
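For readers on Python 3, where `urllib2` has become `urllib.request`, the in-memory approach might look roughly like the sketch below. The function names `extract_zip_bytes` and `download_and_extract` are hypothetical, chosen for this example, and the demonstration at the bottom uses an archive built in memory so that it runs without network access.

```python
import io
import os
import tempfile
import urllib.request
import zipfile


def extract_zip_bytes(data, dest_dir):
    """Extract a zip archive held in memory into dest_dir; return the member names."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        zf.extractall(dest_dir)
        return zf.namelist()


def download_and_extract(url, dest_dir):
    """Download url and unzip it into dest_dir without writing the .zip to disk."""
    with urllib.request.urlopen(url) as response:
        return extract_zip_bytes(response.read(), dest_dir)


# Offline demonstration with an archive built in memory instead of a real download.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("demo.csv", "a,b\n1,2\n")

dest = tempfile.mkdtemp()
names = extract_zip_bytes(buf.getvalue(), dest)
```

With the variables from the question, the call would be `download_and_extract(geoipURL, geoipFolder)`, assuming the URL is still being served. Because the data never passes through a text-mode file, the same code should behave identically on Windows, Mac and Linux.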
