Python 2.7 ZIP archive broken when sending as HTTP response

Question

I am running a Python 2.7 script as CGI on Apache 2.4 on Win 10. The script sends a ZIP archive as a download in the HTTP response. I followed the thread How to deploy zip files (or other binaries) trough cgi in Python? , but I keep getting a broken ZIP file. I hope someone can help: I have been trying to resolve this for 2 days and cannot find any info on this behavior.

Demo Script:

import cgi, cgitb, os
import shutil

cgitb.enable()

out_path = os.path.dirname(__file__) + "\\tmp_uploads\\test2.zip"  
            
# send output zip as download
import sys
print "Content-Disposition: attachment; filename=\"test2.zip\""
print "Content-Type: application/zip"
print

##sys.stdout.flush()

with open(out_path,'rb') as zf:
    shutil.copyfileobj(zf, sys.stdout)
##    print zf.read()

Enabling sys.stdout.flush() or using print zf.read() instead of shutil.copyfileobj(zf, sys.stdout) makes no difference.

The original ZIP file is intact, but the downloaded archive is broken.

Answer 1

I have the same problem with a Python 2.7.18 script running as CGI on a lighttpd webserver on Win10. I compared the downloaded ZIP file with the original and found the cause: on Windows, stdout is opened in text mode, so Python automatically converts every \n written to it into \r\n. The only way I found at first to prevent this (see Prevent Python print()'s automatic newline conversion to CRLF on Windows) is to use sys.stdout.buffer, which is not available in Python 2.7.
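
To make the effect concrete, here is a minimal sketch that mimics the conversion on an in-memory byte string. The function name text_mode_write and the sample payload are made up for illustration; the real translation happens inside the Windows C runtime, not in Python code like this.

```python
def text_mode_write(data):
    """Mimic what a Windows text-mode stream does to outgoing bytes."""
    return data.replace(b"\n", b"\r\n")

# Hypothetical sample: a few ZIP-like bytes that happen to contain 0x0A.
payload = b"PK\x03\x04\x0a\x00\x00\x00"
corrupted = text_mode_write(payload)

# One extra byte was inserted, so every offset after it is shifted.
print("%d bytes became %d bytes" % (len(payload), len(corrupted)))
```

Because a ZIP archive stores offsets and checksums, a single inserted byte is enough to make the whole file unreadable.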

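Update: Dalen's answer is the key: start Python with -u. In Python 2, -u not only turns off buffering but also puts stdout in binary mode on Windows, which is what stops the newline conversion. If you don't want to set a system-specific shebang, you can switch stdout to binary mode from inside the script instead: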

import msvcrt, os, sys
msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
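
Since msvcrt exists only on Windows, a script that also has to run elsewhere could guard the call. The following is a rough sketch under that assumption; binary_stdout is a hypothetical helper name, and the getattr fallback is only there so the same code also works on Python 3, where the byte stream is sys.stdout.buffer.

```python
import os
import sys

def binary_stdout():
    """Return a stdout object that passes bytes through unchanged."""
    if sys.platform == "win32":
        import msvcrt  # Windows-only module
        # Switch the underlying file descriptor from text to binary mode.
        msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
    # Python 2.7 has no sys.stdout.buffer, so fall back to sys.stdout itself.
    return getattr(sys.stdout, "buffer", sys.stdout)
```

In the demo script from the question, this would be called once before the headers are printed, and the returned object passed to shutil.copyfileobj in place of sys.stdout.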

Answer 2

First of all, if you want Windows to transfer longer content, especially binary content, through pipes (in this case STDOUT, layered over the pipes Apache uses for CGI), then you have to turn off buffering completely and take control of it yourself. So your script's shebang line should look like:

#!C:\Python27\python.exe -u

Secondly, you have to send the content in portions and call flush() after each one, i.e. do the buffering yourself. Otherwise, if the file you are sending is big, either the HTTP server will consider your script frozen and kill it, sending a timeout or internal server error response to the client, or the client will terminate the connection after waiting too long for the response. So the STDOUT pipe has to stay lively.

data = zf.read(8192)
while data:
    sys.stdout.write(data)
    sys.stdout.flush()
    data = zf.read(8192)
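
The same loop can be wrapped in a small function, which makes it easy to try out without a web server. This is only a sketch: stream_file is a hypothetical name, and the io.BytesIO objects stand in for the open ZIP file and for a binary-mode stdout.

```python
import io

def stream_file(src, dst, chunk_size=8192):
    """Copy src to dst in fixed-size chunks, flushing after each chunk."""
    total = 0
    while True:
        data = src.read(chunk_size)
        if not data:  # an empty read means end of file
            break
        dst.write(data)
        dst.flush()
        total += len(data)
    return total

# In-memory stand-ins for the ZIP file and stdout.
source = io.BytesIO(b"\x00\n\xff" * 10000)
sink = io.BytesIO()
copied = stream_file(source, sink)
```

Returning the byte count is a design choice for the sketch; it lets the caller compare the amount sent with the file size on disk.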

And lastly, do declare the Content-Transfer-Encoding header as "binary". It helps. It is also advisable to provide a Content-Length header: the output of the script is a continuous stream, and Apache has no idea how many bytes it will contain until the script ends, at which point it is too late because all headers have already been sent.
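
As an illustration of the Content-Length advice, the header block can be built from the file size before any body bytes are written. This is a sketch, not the answerer's code: build_download_headers is a hypothetical name, and the temporary 22-byte file is only a stand-in for the real archive.

```python
import os
import tempfile

def build_download_headers(path, download_name):
    """Build CGI response headers, including Content-Length, for a ZIP file."""
    size = os.path.getsize(path)
    lines = [
        "Content-Type: application/zip",
        'Content-Disposition: attachment; filename="%s"' % download_name,
        "Content-Length: %d" % size,
    ]
    # A blank line separates the headers from the body.
    return "\r\n".join(lines) + "\r\n\r\n"

# Demo with a throwaway 22-byte file standing in for the archive.
with tempfile.NamedTemporaryFile(suffix=".zip", delete=False) as tmp:
    tmp.write(b"PK\x05\x06" + b"\x00" * 18)
demo_path = tmp.name
demo_headers = build_download_headers(demo_path, "test2.zip")
os.remove(demo_path)
```

The size is only correct if stdout is in binary mode; with newline conversion active, the body would end up longer than the declared length.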

2 answers

solution1
1 2021-06-28 23:11:31

solution2
0 2021-06-29 00:03:29
