File Downloads Done Right »

Created at: 22.02.2009 04:44, source: The Rails Way - Home, tagged: apache downloads files

Getting your file downloads right is one of the most important parts of your File Management functionality. A poorly implemented download function can make your application painful to use, not just for downloaders, but for everyone else too.

Thankfully it’s also one of the easiest things to get right.

The simple version

For the purposes of this article let’s assume that your application needs to provide access to a large zip file, but that access should be restricted to logged-in users.

The first choice we have to make is where to store this file. In this case there’s really only one wrong answer, and that’s to store it in the public folder of your rails application. Every file stored in public will be served by our webserver without the involvement of our rails application. This makes it impossible for us to check that the user has logged in. Unless your files are completely public, you shouldn’t go anywhere near the public folder.

So let’s assume we’ve stored the zip file in:

/home/railsway/downloads/huge.zip

Next we need a simple download action to send the file to the user. Thankfully rails has this built right in:

  before_filter :login_required

  def download
    # Rails reads the file itself and streams it to the client.
    send_file '/home/railsway/downloads/huge.zip', :type => 'application/zip'
  end

Now when our users click the download link, they’ll be asked to choose a location and the file will start downloading. The bad news is, there’s a catch here. The good news is it’s easy to fix.

What’s the catch?

The problem here is one of scarce resources, and that resource is your rails processes. Whether you’re using mongrel, fastcgi or passenger you have a limited number of rails processes available to handle application requests. When one of your users makes a request, you want to know that you either have a process free to handle the request, or that one will become free in short order. If you don’t, users will face an agonizing wait for pages to load, or see their browser sessions time out entirely.

When you use the default behaviour of send_file to send the file out to the user, your rails process will read through the entire file, copying the contents of the file to the output stream as it goes. For small files like images this probably isn’t that big of a deal, but for something enormous like a 200 MB zip file, using send_file will tie up a process for a long time. Users on slow connections will tie up a rails process for correspondingly longer.
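To make the cost concrete, here is a minimal sketch of that read-and-copy loop. It is not the actual send_file implementation; the method name stream_in_chunks and the 4096-byte chunk size are assumptions for illustration, and StringIO objects stand in for the file on disk and the client connection.

```ruby
require 'stringio'

# Hypothetical helper: copy everything from `input` to `output` in
# fixed-size chunks and return the number of bytes copied. The calling
# process is busy until the last chunk has been written.
def stream_in_chunks(input, output, chunk_size = 4096)
  bytes_sent = 0
  # IO#read(n) returns nil at end of file, which ends the loop.
  while (chunk = input.read(chunk_size))
    output.write(chunk)
    bytes_sent += chunk.bytesize
  end
  bytes_sent
end

# Stand-ins for the file on disk and the client connection.
source = StringIO.new('x' * 10_000)
client = StringIO.new
sent = stream_in_chunks(source, client)
puts "sent #{sent} bytes"
```

With a real socket, each write can block until the client is ready for more data, so the loop runs only as fast as the slowest downloader.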

If you get a large number of downloads running, you may find all your rails processes taken up by downloaders, with none left to serve any other users. For all intents and purposes your site is down: you’ve instituted a denial of service attack against yourself.
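A rough back-of-the-envelope calculation shows how quickly this happens. The figures below (a 2 Mbit/s client and the helper names) are assumptions chosen for illustration, not measurements.

```ruby
# Seconds a single download occupies a process, given the file size in
# megabytes and the client's bandwidth in megabits per second.
def seconds_per_download(file_megabytes, client_megabits_per_second)
  (file_megabytes * 8.0) / client_megabits_per_second
end

# How many such downloads one process can complete per hour.
def downloads_per_process_per_hour(file_megabytes, client_megabits_per_second)
  3600.0 / seconds_per_download(file_megabytes, client_megabits_per_second)
end

# 200 MB file, 2 Mbit/s client (assumed figures).
seconds = seconds_per_download(200, 2)
puts seconds                                  # 800.0
puts downloads_per_process_per_hour(200, 2)   # 4.5
```

Under those assumptions each download holds a process for more than 13 minutes, so a handful of simultaneous downloaders is enough to occupy a small pool of processes.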

What about threads?

Unfortunately threads in ruby won’t save us. The combination of blocking IO and green threads means that even though you’re doing the work in a thread, it’s blocking the entire process most of the time anyway. JRuby users may get a performance improvement, but it’s still going to be a noticeable consumption of resources when compared to letting a web server stream the file.

Don’t believe everything you read on the internet: threads and ruby just won’t help you with most of this stuff.

So What’s the Solution?

Thankfully this problem was solved a long time ago by the folks at LiveJournal. They used Perl instead of ruby, but had the same problem: downloading files blocked their application processes for too long and made other users wait. Their solution was elegant and simple. Instead of making the application processes stream the file to the user, they simply tell the web server which file to send, and let the web server handle the details of streaming it out to the client.

Their particular solution is quite cumbersome to set up and use, but there’s a very similar solution available called X-Sendfile. It’s supported out of the box with later versions of lighttpd, and available as a module for apache.

Here’s how it works. Instead of sending the file to our users, our rails application simply checks that they’re allowed to download it (using our login_required filter), writes the name of the file into a special response header, and renders an empty response. When apache sees that header it reads the file from disk and streams it out to the user. So your headers will look something like:

X-Sendfile: /home/railsway/downloads/huge.zip
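As a rough sketch of what the application’s side of that exchange amounts to, the plain-ruby helper below builds the headers and the empty body. The helper name x_sendfile_response is hypothetical, and the Content-Type and Content-Disposition headers are assumptions about what you would typically send alongside X-Sendfile; this is not how rails implements it internally.

```ruby
# Hypothetical helper: build the headers and body for a response that
# hands the download off to the web server. The body is left empty;
# the web server replaces it with the contents of the file on disk.
def x_sendfile_response(path, content_type)
  headers = {
    'X-Sendfile'          => path,
    'Content-Type'        => content_type,
    'Content-Disposition' => %(attachment; filename="#{File.basename(path)}")
  }
  [headers, '']
end

headers, body = x_sendfile_response('/home/railsway/downloads/huge.zip', 'application/zip')
headers.each { |name, value| puts "#{name}: #{value}" }
```

The point is that the response costs the application a few hundred bytes no matter how large the file is.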

The apache module has a slightly annoying default setting that prevents it from sending files outside the public folder, so you’ll need to add the following configuration option:

XSendFileAllowAbove on

Thankfully for rails users, X-Sendfile support is built right into rails, so a minor change to our action is all we need:

  
  before_filter :login_required

  def download
    # Rails only sets the X-Sendfile header; the web server streams the file.
    send_file '/home/railsway/downloads/huge.zip', :type => 'application/zip', :x_sendfile => true
  end

With that, we’re done. Our rails process just makes a quick authorization check and renders a short response, and apache uses its own optimised file streaming code to send the file down to our users. Meanwhile, the rails process is free to go on to the next request.

Nginx users can use a similar header called X-Accel-Redirect. This is a little more fiddly to set up, and requires your application to write a special internal URL to the http response rather than the full path, but in terms of scalability and resource contention, it’s just as great. There’s an intro to the nginx module available if you’re an nginx user. If only uploads were this easy!
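For the nginx case, the fiddly part is translating a path on disk into that internal URL. The sketch below assumes nginx has a location marked internal at /protected that maps onto the downloads directory; the /protected prefix, the constant names, and the accel_redirect_uri helper are all hypothetical.

```ruby
DOWNLOAD_ROOT   = '/home/railsway/downloads'  # directory used in this article
INTERNAL_PREFIX = '/protected'                # hypothetical internal nginx location

# Hypothetical helper: translate a file path under DOWNLOAD_ROOT into the
# internal URL to put in the X-Accel-Redirect header. Paths that resolve
# outside the downloads directory are rejected.
def accel_redirect_uri(path)
  full = File.expand_path(path)
  unless full.start_with?(DOWNLOAD_ROOT + '/')
    raise ArgumentError, "#{path} is outside #{DOWNLOAD_ROOT}"
  end
  INTERNAL_PREFIX + full[DOWNLOAD_ROOT.length..-1]
end

puts accel_redirect_uri('/home/railsway/downloads/huge.zip')  # /protected/huge.zip
```

Expanding the path before checking it means a request for something like downloads/../secret is refused rather than mapped to an internal URL.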

Up Next

The next article in the series will cover my experiences when dealing with the storage of your files. Should you use S3? What about blobs, NFS, GFS or MogileFS?



Switch to Passenger (mod_rails) in development on OSX in less than 7 minutes or your money back! »

Created at: 12.02.2009 00:52, source: Robby on Rails, tagged: Ruby on Rails programming development planetargon boxcar rubyonrails rails passenger osx apache

We recently switched our default builds of Rails Boxcar to leverage the benefits of using Passenger (mod_rails) for deployment of your Ruby on Rails applications and it’s been working out great for our customers. Several of our customers and colleagues mentioned that they also began using Passenger in development, which was intriguing.

But… Mongrel has been working great for us for the past few years. Why switch?

It’s true, I’ve been happily using mongrel since it came out as a replacement for webrick back in early 2006, which makes it about 28 in dog years.

Nigel and I.. 2 1/2 years ago back when Mongrel was just a puppy

But… over the next few weeks, I’m going to evaluate Passenger in my development workflow. There’s no better way to try something than to jump in head first. So… here goes.

this guy was a passenger…and I recently started to watch the show

Our team will be evaluating Passenger in our development workflow, with a blog post to follow, but if you want to get your feet wet right away, here are some instructions for setting up Passenger on OSX with PrefPane, which were inspired by Manfred’s posts.

Installing Passenger via RubyGems

To install Passenger on your OSX machine, just run the following with root credentials.

sudo gem install passenger

This will install the passenger gem on your machine. Now we need to go ahead and run a script that is provided with this gem (also with root credentials).

sudo passenger-install-apache2-module

Follow the instructions that appear. When the installer prints the configuration lines for Apache (the LoadModule, PassengerRoot and PassengerRuby directives), copy and paste them into an Apache configuration file. I created mine at /etc/apache2/other/passenger.conf.

Edit this file with your editor of choice:

mate /etc/apache2/other/passenger.conf

Mine looks like:


  #/etc/apache2/other/passenger.conf

  # Passenger modules and configuration
  LoadModule passenger_module /opt/local/lib/ruby/gems/1.8/gems/passenger-2.0.6/ext/apache2/mod_passenger.so
  PassengerRoot /opt/local/lib/ruby/gems/1.8/gems/passenger-2.0.6
  PassengerRuby /opt/local/bin/ruby

  # Set the default environment to development
  RailsEnv development

  # Which directory do you want Apache to be able to look into for projects?
  <Directory "/Users/robbyrussell/Projects/development">
      Order allow,deny
      Allow from all
  </Directory>

Once you finish running through sudo passenger-install-apache2-module, you’ll need to restart Apache on your workstation. This can be done by simply turning off/on Web Sharing in your Sharing Preference Pane.


Alright, we got through the hard part. Now, in order for you to begin using Passenger, we need to set up Apache to point to your individual Ruby on Rails application(s). You can hack on Apache configuration files more, but there is an easier way thanks to the Passenger Preference Pane.

This will manage your VHost files for you!

Setting up Preference Pane

If you followed my post on installing Ruby on Rails via MacPorts, you’re going to need to install Ruby Cocoa, which can be done with the following. If you’re using the Ruby provided from Apple, you can skip this step.

sudo port install rb-cocoa

Once that is done, download the Passenger Preference Pane. You can then install the preference pane by double-clicking on the following file.

PassengerPane-1.2

The next part is really simple as well. Just begin to add your various Ruby on Rails projects into the Preference Pane… and when you’re done, you should be able to run your applications over port 80 without any problems.

As you can see, I’ve already set up a handful of projects and we don’t have to start/stop mongrels for each one or worry about port numbers when running multiple projects. (time savings!)


Voila. Simple enough. You might need to stop/start Apache; I can’t remember whether I needed to or not.

For each host that you add into this panel, it’ll automatically be added so that you can immediately browse to http://yourhost.local and it should just work. :-)

Things to still figure out…

Debugging. If you’re used to doing --debugger, it appears that you can do something similar with the socket-debugger plugin. I haven’t tried it myself, but it’s worth looking into.

Browser testing via VMWare/Parallels/VirtualBox. Does anybody have any tips on how to best approach this? Our designers are curious…

As I mentioned, this is day one of trying it out, and I managed to motivate our entire design and development team to try it with me so that we can all learn about issues together and find solutions quicker. If you’ve been using this approach for a while, I’d be interested in hearing your story and whether there are any issues that we should be aware of.

