Dan Ward

A self-confessed geek and web developer

 

Blog: A custom MongoDB/GridFS file wrapper for the Django HttpResponse object


Created by dan on Monday 22nd February, 2010 at 9:28 PM
Tags: Web development, Django, Python, MongoDB, GridFS

Until today, I had been using Django's FileWrapper class to iterate over an open GridFile and yield chunks to an HttpResponse object. However, I noticed that for files over 8 KB the same data was being repeated in the output.

As far as I could fathom, this was happening because Django's FileWrapper class reads one chunk (8 KB by default) per yield to the calling HttpResponse object, whereas an open GridFile does not seek forward in the file by the amount read in the previous yield. The seek has to be done manually.
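To illustrate the behaviour described above, here is a minimal sketch. NonAdvancingFile is a hypothetical stand-in written purely for this example, not a PyMongo class; it only mimics a file object whose read() leaves the position where it was, which is what I appeared to be seeing with GridFile.

```python
class NonAdvancingFile(object):
    """Hypothetical stand-in: read() does not move the file position."""

    def __init__(self, data):
        self.data = data
        self.position = 0

    def seek(self, position):
        self.position = position

    def read(self, size):
        # Return up to `size` bytes from the current position, without advancing
        return self.data[self.position:self.position + size]


f = NonAdvancingFile(b"abcdefgh")

# Two reads in a row return the same bytes, which is how output gets repeated
first = f.read(4)
second = f.read(4)

# Seeking manually between reads moves on to the next chunk
f.seek(4)
third = f.read(4)

print(first, second, third)
```

A wrapper that assumes read() advances the position on its own would keep receiving the first chunk from an object like this, so the seek has to happen between reads.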

I came up with the function below to use as a file wrapper for a GridFile. It uses the chunk size set by GridFile.chunk_size (256 KB per yield for my files), although it can be set to something like 8 KB (8192 B) per yield if required.

# Define file output iterator
def gridFSWrapper(filedata):

    # Set read seek position
    seeker = 0

    # Set bytes read per yield to the GridFile's chunk size
    bytes_per_yield = filedata.chunk_size

    # Attempt to yield file data
    try:

        # Iterate while the seek position is still inside the file
        # (strictly less than, so no empty chunk is yielded at the end)
        while seeker < filedata.length:

            # Set seek position in file
            filedata.seek(seeker)

            # Yield chunk from file (relative to seek position)
            yield filedata.read(bytes_per_yield)

            # Increment seek position by bytes read per yield
            seeker = seeker + bytes_per_yield

    # Always close the file: on completion, when an exception propagates,
    # or when the generator is closed early (GeneratorExit)
    finally:
        filedata.close()
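
The yield size mentioned earlier (for example 8 KB instead of the GridFile's chunk size) could be exposed as an optional argument. The following is only a sketch under assumptions: grid_file_chunks and its bytes_per_yield parameter are names made up for this example, and FakeGridFile is an in-memory stand-in (not PyMongo) that provides just the chunk_size, length, seek, read and close members the wrapper relies on, so the loop can be tried without a database.

```python
import io


def grid_file_chunks(filedata, bytes_per_yield=None):
    """Yield a file's contents in pieces, seeking manually between reads."""
    # Fall back to the file's own chunk size when no size is given
    if bytes_per_yield is None:
        bytes_per_yield = filedata.chunk_size
    seeker = 0
    try:
        while seeker < filedata.length:
            filedata.seek(seeker)
            yield filedata.read(bytes_per_yield)
            seeker += bytes_per_yield
    finally:
        # Runs on completion, on errors and when the generator is closed early
        filedata.close()


class FakeGridFile(io.BytesIO):
    """Hypothetical in-memory stand-in exposing chunk_size and length."""

    def __init__(self, data, chunk_size):
        super(FakeGridFile, self).__init__(data)
        self.chunk_size = chunk_size
        self.length = len(data)


data = b"x" * 20000

# Default: one yield per chunk_size bytes
default_chunks = list(grid_file_chunks(FakeGridFile(data, chunk_size=16384)))

# Override: 8 KB per yield
small_chunks = list(grid_file_chunks(FakeGridFile(data, chunk_size=16384), 8192))
```

Whichever size is used, joining the yielded chunks should give back the original bytes, and the final chunk is simply shorter than the rest.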

To use the wrapper, the following lines would work with an open GridFile (declared as 'fdata'):

response = HttpResponse(gridFSWrapper(fdata), content_type=fdata.content_type)
response['Content-Disposition'] = 'attachment'
response['Content-Length'] = fdata.length
return response

For more information on GridFile methods/properties with Python/PyMongo, see:
http://api.mongodb.org/python/1.4%2B/api/gridfs/grid_file.html#gridfs.grid_file.GridFile