Fake Filenames for Far Future Expires Headers

Everyone knows that one of the best ways to improve page performance is to reduce the number of HTTP requests required for related assets.  There's a pile of ways to approach this (JS/CSS aggregation, image sprites, caching), but the best (and simplest) is client-side caching.  It doesn't help your first-time visitors at all (which is why you still need to look at other optimizations), but for repeat visitors, and for people who stick around for more than one page, caching is a huge win.

The key to all of this is the Expires header: you want to set an expiration way in the future so the client (the browser) will cache the asset forever.  With Apache it's really simple:

LoadModule expires_module modules/mod_expires.so
# mod_expires sends nothing until it is switched on
ExpiresActive   On
ExpiresByType   application/x-javascript    "now plus 10 years"

That tells Apache to set an expiration 10 years in the future on everything it serves as application/x-javascript, which here means all JS files (if your server labels JS with a different MIME type, use that instead).  In web time, that's pretty much forever.  Really simple, really effective.  You also want to use mod_headers to set the Cache-Control and Pragma headers, remove the ETag and Last-Modified headers, and make sure Apache doesn't generate file ETags:

LoadModule headers_module modules/mod_headers.so
Header  set     Cache-Control    "public"
Header  set     Pragma           ""
Header  unset   Last-Modified
Header  unset   ETag
FileETag        None

The problem is that if you ever change one of those JS files, clients who already have the old version will never know about it.  The solution is to never modify files, only create and delete files.  So if you have 'script.js' and you need to make an update, instead of pushing a new version of that file, you'd instead push 'script_2.js' (or whatever).  That way you're guaranteed that every client will download it afresh (with the long Expires header) because no one has ever seen the file before.  Next time you need to make changes, you'd push 'script_3.js'.

This quickly becomes a bit of a problem, because not only do you have to change the filename, you have to change all the references to the filename as well.  So a little JS tweak suddenly becomes a change to all your SCRIPT tags and a republish of all your content.  Not too fun.  This is the problem I can help solve, using our friend mod_rewrite (of course!).

Check this innocuous little condition/rule:

RewriteEngine  On
# Only rewrite when the requested file is missing (or empty)
RewriteCond    %{REQUEST_FILENAME}    !-s
RewriteRule    (.*)_[0-9]+\.(js|css)$    $1.$2

That says: any time a request comes in for a JS or CSS file whose name ends with an underscore followed by one or more digits, and no such file exists on disk, remove the underscore and digits.  I.e. when 'script_2.js' gets requested and that file doesn't exist, serve back 'script.js' instead.  That's handy, because now we can use an arbitrary version number and they all hit the same file.  This is not a perfect solution, but it works for the vast majority of cases, since you can just modify 'script.js' in place without a care in the world, and then bump the number in your references to force a cache refresh.  Because HTTP caching is keyed on URIs, the fact that requests for 'script.js' and 'script_2.js' both hit the same file on disk is irrelevant; they're separate URIs, so they'll be cached separately.
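To make the mapping concrete, here is a rough Python simulation of what the condition/rule pair does to a request path.  It is only a sketch: the `resolve` function and the `existing_files` set are made-up names standing in for Apache and the filesystem, and the set membership test is a simplification of the `-s` check (which also requires a non-empty file).

```python
import re

# Same pattern as the RewriteRule: "_<digits>" right before .js or .css
VERSIONED = re.compile(r"(.*)_[0-9]+\.(js|css)$")


def resolve(requested, existing_files):
    """Return the path that would end up being served for `requested`.

    `existing_files` is a hypothetical stand-in for the disk; Apache's
    RewriteCond with !-s does the real check.
    """
    if requested in existing_files:
        # The file really exists, so the condition fails and nothing changes.
        return requested
    match = VERSIONED.match(requested)
    if match:
        # Drop the underscore and digits, keep the extension.
        return f"{match.group(1)}.{match.group(2)}"
    # Not a versioned JS/CSS name: left alone.
    return requested


if __name__ == "__main__":
    disk = {"/js/script.js"}
    for path in ("/js/script_2.js", "/js/script_3.js", "/js/script.js"):
        print(path, "->", resolve(path, disk))
```

Note that a file that genuinely has a numeric suffix (say a real 'script_2.js' on disk) is served untouched, which is exactly what the condition is there for.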

That's not a solution in and of itself, however, because we still have to update all the SCRIPT tags to use a new filename (even though it'll end up hitting the same file on disk).  But now that the URI is divorced from the file itself, we don't have to keep anything in sync.

The last piece is to set up a global variable to use in your script suffixes:

<cfset application.scriptVersion = 2 />
...
<script type="text/javascript" src="/path/to/script_#application.scriptVersion#.js"></script>

Then any time you increment that scriptVersion variable, all your JS will suddenly become uncached and everyone will refresh.  So you can just hack away on script.js until you're happy, bump the variable up, and you're done.  No new files, no changing SCRIPT tags, super simple.
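If you're not on ColdFusion, the same trick is a one-line helper in pretty much any language.  Here's a minimal Python sketch; `SCRIPT_VERSION` and `versioned_url` are hypothetical names of my own, not part of any framework.

```python
import posixpath

# Hypothetical global, playing the role of application.scriptVersion
SCRIPT_VERSION = 2


def versioned_url(path, version=SCRIPT_VERSION):
    """Insert "_<version>" just before the file extension of a URL path."""
    root, ext = posixpath.splitext(path)
    return f"{root}_{version}{ext}"


if __name__ == "__main__":
    src = versioned_url("/path/to/script.js")
    print(f'<script type="text/javascript" src="{src}"></script>')
```

Bump `SCRIPT_VERSION` and every generated URL changes at once, while the files on disk keep their plain names.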

6 responses to “Fake Filenames for Far Future Expires Headers”

  1. Chris Blackwell

    The rewrite rule isn't quite right: there's no second group to be referenced as $2.
    Should be:

    RewriteRule (.*)_[0-9]+\.js$ $1.js

    What would be really cool is if you had your ant build script set the variable application.scriptVersion with your revision from SVN

  2. Ben Nadel

    One approach that I have used in the past, which is quite similar, is to have the "script version" as a query param rather than as the file name:

    /path/to/script.js?v=#application.scriptVersion#

    I am not 100% sure if this will cache the same way as a file name, though. As far as the header values, I've only ever played with the expires – I have no idea how to mess with ETags on IIS :)

  3. Ben Nadel

    I can totally understand that. A lot of what I do in my code is based purely on aesthetics (I am not joking).

  4. Andy

    Has anyone come across the following problem?

    I have searched high and low and not found anything about this….

    The + in the file name seems to thwart the far-future Expires header:

    http://www.ineaguide.org/files/imagecache/100X100/villa+eugenie.jpg