Python IO Gurus: what are the differences between these two methods?

Question

I have two methods for writing binary files: the first works with data received by a server corresponding to a file upload (i.e., handling a form whose enctype="multipart/form-data"), and the second works with file data sent as email attachments (i.e., file data obtained by parsing an email message body using get_payload()).

The odd thing is, they're not interchangeable: if I use the first one to save data parsed from an email attachment, it fails; similarly, the second function fails when dealing with uploaded file data.

What are the critical differences?

This is the first method:

import os

# confirm_folder, file_base_name and read_buffer are helpers defined elsewhere (not shown).
def write_binary_file(folder, filename, f, chunk_size=4096):
    """Write the file data f to the folder and filename combination"""
    result = False
    if confirm_folder(folder):
        try:
            # The with block closes the output file even if a write fails.
            with open(os.path.join(folder, file_base_name(filename)), 'wb', chunk_size) as file_obj:
                # Copy the data chunk by chunk.
                for file_chunk in read_buffer(f, chunk_size):
                    file_obj.write(file_chunk)
            result = True
        except IOError:
            print "file_utils.write_binary_file: could not write '%s' to '%s'" % (file_base_name(filename), folder)
    return result

This is the second method:

import os

# confirm_folder and file_base_name are helpers defined elsewhere (not shown).
def write_binary_file(folder, filename, filedata):
    """Write the binary file data to the folder and filename combination"""
    result = False
    if confirm_folder(folder):
        try:
            # The with block closes the output file even if the write fails.
            with open(os.path.join(folder, file_base_name(filename)), 'wb') as file_obj:
                # Write all the data in a single call.
                file_obj.write(filedata)
            result = True
        except IOError:
            print "file_utils.write_binary_file: could not write '%s' to '%s'" % (file_base_name(filename), folder)
    return result

Answer 1

The difference is that the HTTP upload method (the first one) receives the file-like object itself (the "f" parameter) and uses a helper, "read_buffer" (presumably specific to your CGI upload handling), to read data in chunks from that object and copy them to the actual file.

This can make sense in an HTTP upload application, as it would allow the file copy to start while the upload is still in progress. Personally, I don't think it matters except for uploads of several megabytes, since in a simple CGI script your HTTP response is held back until the whole upload is done anyway.

The other method receives "filedata" as its parameter: all it has to do is write this data to a new file. (The first one has to read the data from a file-like object, and it still creates an intermediary object for that.)

You can use the second method to save the HTTP data; just pass it the kind of object it expects. Instead of calling the second function with the "f" argument provided by your CGI field values, call it with "f.read()". This reads all the data from the "f" file-like object and hands the resulting data to the method.

i.e.:

# second case:
write_binary_file(folder, filename, f.read())
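The conversion works in both directions, which is what makes the two functions usable with either kind of input. The following is only a sketch: io.BytesIO stands in for the upload's file-like object, and the byte strings are made-up sample data, not anything from the question.

```python
import io

# Stand-in for the upload's file-like object "f" (sample data, hypothetical).
upload = io.BytesIO(b"uploaded bytes")

# Direction 1: file-like object -> raw data, suitable for the second function.
filedata = upload.read()

# Direction 2: raw data -> file-like object, suitable for the first function.
# attachment_bytes stands in for the decoded payload of an email attachment.
attachment_bytes = b"attachment bytes"
f = io.BytesIO(attachment_bytes)
first_chunk = f.read(4)  # a file-like object can be read piece by piece
```

Note that f.read() loads the whole upload into memory, which is fine for small files but gives up the benefit of chunked copying.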

Answer 2

The first one probably expects a file-like object as a parameter, from which it reads the data. The second one expects that parameter to be a string with the actual data to be written.
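If both kinds of input need to be handled in one place, a single function can check which one it was given. This is a minimal sketch, not the asker's code: write_binary, read_back, and the file names are hypothetical, and the check simply assumes that anything with a read attribute is a binary file-like object.

```python
import io
import os
import shutil
import tempfile

def write_binary(path, source, chunk_size=4096):
    """Write source to path; source may be bytes or a binary file-like object."""
    with open(path, "wb") as out:
        if hasattr(source, "read"):
            # File-like object: copy it across in chunks of chunk_size bytes.
            shutil.copyfileobj(source, out, chunk_size)
        else:
            # Raw data: write it in a single call.
            out.write(source)

def read_back(path):
    """Return the full contents of path as bytes."""
    with open(path, "rb") as handle:
        return handle.read()

# Usage with made-up sample data in a temporary folder.
folder = tempfile.mkdtemp()
upload_path = os.path.join(folder, "upload.bin")
attachment_path = os.path.join(folder, "attachment.bin")
write_binary(upload_path, io.BytesIO(b"from an upload"))
write_binary(attachment_path, b"from an attachment")
```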

To be sure you have to look at what your read_buffer function does.
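The question does not show read_buffer, so the following is only a guess at what such a helper typically looks like: a generator that keeps calling read(chunk_size) on a file-like object until nothing is left. If the real one works this way, it needs an object with a read method.

```python
import io

def read_buffer(f, chunk_size=4096):
    """Yield successive chunks from a file-like object until it is exhausted.

    Hypothetical reconstruction; the asker's real helper is not shown.
    """
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        yield chunk

# Ten bytes read four at a time give chunks of 4, 4 and 2 bytes.
chunks = list(read_buffer(io.BytesIO(b"abcdefghij"), chunk_size=4))
```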

Answer 3

The most obvious difference is the chunked reading of data. You don't specify the error, but I'm guessing that the chunked method fails in the call to read_buffer.
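To illustrate that guess: if read_buffer calls read() on its argument (an assumption, since the helper is not shown), then passing it raw data instead of a file-like object fails with an AttributeError. That exception is not an IOError, so the except clause in the question would not catch it.

```python
def read_buffer(f, chunk_size=4096):
    """Hypothetical stand-in for the asker's helper: read f in chunks."""
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        yield chunk

# Raw data, as an email attachment's payload would be, has no read() method.
try:
    list(read_buffer(b"raw attachment data"))
    error_name = None
except AttributeError as exc:
    error_name = type(exc).__name__
```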
