Matt Raible is a Web Architecture Consultant specializing in open source frameworks.


Versioning Static Assets with UrlRewriteFilter

A few weeks ago, a co-worker sent me an interesting email after talking with the Zoompf CEO at JSConf.

One interesting tip mentioned was how we querystring the version on our scripts and css. Apparently this doesn't always cache the way we expected it would (some proxies will never cache an asset if it has a querystring). The recommendation is to rev the filename itself.

This article explains how we implemented a "cache busting" system in our application with Maven and the UrlRewriteFilter. We originally put the version in a query string, but switched to versioned filenames after reading Souders' recommendation. That part was figured out by my esteemed colleague Noah Paci.

Our Requirements

  • Make the URL include a version number for each static asset URL (JS, CSS and SWF) that serves to expire a client's cache of the asset.
  • Insert the version number into the application so the version number can be included in the URL.
  • Use a random version number when in development mode (based on running without a packaged war) so that developers will not need to clear their browser cache when making changes to static resources. The random version number should match the production version number format, which is currently: x.y-SNAPSHOT-revisionNumber
  • When running in production, the version number/cachebust is computed once (when a Filter is initialized). In development, a new cachebust is computed on each request.

In our app, we're using Maven, Spring and JSP, but the latter two don't really matter for the purposes of this discussion.

Implementation Steps
1. First we added the buildnumber-maven-plugin to our project's pom.xml so the build number is calculated from SVN.

<plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>buildnumber-maven-plugin</artifactId>
    <version>1.0-beta-4</version>
    <executions>
        <execution>
            <phase>validate</phase>
            <goals>
                <goal>create</goal>
            </goals>
        </execution>
    </executions>
    <configuration>
        <doCheck>false</doCheck>
        <doUpdate>false</doUpdate>
        <providerImplementations>
            <svn>javasvn</svn>
        </providerImplementations>
    </configuration>
</plugin>

2. Next we used the maven-war-plugin to add these values to our WAR's MANIFEST.MF file.

<plugin>
    <artifactId>maven-war-plugin</artifactId>
    <version>2.0.2</version>
    <configuration>
        <archive>
            <manifest>
                <addDefaultImplementationEntries>true</addDefaultImplementationEntries>
            </manifest>
            <manifestEntries>
                <Implementation-Version>${project.version}</Implementation-Version>
                <Implementation-Build>${buildNumber}</Implementation-Build>
                <Implementation-Timestamp>${timestamp}</Implementation-Timestamp>
            </manifestEntries>
        </archive>
    </configuration>
</plugin>

3. Then we configured a Filter to read the values from this file on startup. If this file doesn't exist, a default version number of "1.0-SNAPSHOT-{random}" is used. Otherwise, the version is calculated as ${project.version}-${buildNumber}.

private String buildNumber = null;

...
@Override
public void initFilterBean() throws ServletException {
    try {
        InputStream is = 
            servletContext.getResourceAsStream("/META-INF/MANIFEST.MF");
        if (is == null) {
            log.warn("META-INF/MANIFEST.MF not found.");
        } else {
            Manifest mf = new Manifest();
            mf.read(is);
            Attributes atts = mf.getMainAttributes();
            buildNumber = atts.getValue("Implementation-Version") + "-" + atts.getValue("Implementation-Build");
            log.info("Application version set to: " + buildNumber);
        }
    } catch (IOException e) {
        log.error("I/O Exception reading manifest: " + e.getMessage());
    }
}

...

    // If there was a build number defined in the war, then use it for
    // the cache buster. Otherwise, assume we are in development mode 
    // and use a random cache buster so developers don't have to clear 
    // their browser cache.
    requestVars.put("cachebust", buildNumber != null ? buildNumber : "1.0-SNAPSHOT-" + new Random().nextInt(100000));
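
To try the version logic outside a servlet container, the two pieces above can be pulled into a small standalone class. This is only a sketch: the class and method names are hypothetical, the manifest is built from an in-memory string rather than read from a WAR, and, unlike the filter above, it treats a manifest that lacks either attribute as "no build number" (an assumption on my part).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper, not the filter from the post: it isolates the
// "manifest version or random fallback" decision so it can be run alone.
final class CacheBustSketch {

    // Returns "<Implementation-Version>-<Implementation-Build>", or null when
    // the stream is missing or either attribute is absent (assumed behavior).
    static String versionFrom(InputStream manifestStream) {
        if (manifestStream == null) {
            return null;
        }
        try (InputStream is = manifestStream) {
            Attributes atts = new Manifest(is).getMainAttributes();
            String version = atts.getValue("Implementation-Version");
            String build = atts.getValue("Implementation-Build");
            return (version == null || build == null) ? null : version + "-" + build;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Uses the build number when there is one; otherwise falls back to a
    // random development value in the same x.y-SNAPSHOT-number shape.
    static String cacheBust(String buildNumber, Random random) {
        return buildNumber != null
                ? buildNumber
                : "1.0-SNAPSHOT-" + random.nextInt(100000);
    }

    public static void main(String[] args) {
        // Made-up manifest content standing in for a packaged WAR.
        String manifest = "Manifest-Version: 1.0\r\n"
                + "Implementation-Version: 2.3\r\n"
                + "Implementation-Build: 4567\r\n\r\n";
        InputStream is = new ByteArrayInputStream(
                manifest.getBytes(StandardCharsets.UTF_8));
        System.out.println(cacheBust(versionFrom(is), new Random()));   // packaged WAR case
        System.out.println(cacheBust(versionFrom(null), new Random())); // development case
    }
}
```

Passing the Random in makes the development branch easy to exercise in a test; the filter in the post simply creates a new Random per request.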

4. We then used the "cachebust" variable and appended it to static asset URLs as indicated below.

<c:set var="version" scope="request" 
    value="${requestScope.requestConfig.cachebust}"/>
<c:set var="base" scope="request"
    value="${pageContext.request.contextPath}"/>

<link rel="stylesheet" type="text/css" 
    href="${base}/v/${version}/assets/css/style.css" media="all"/>

<script type="text/javascript" 
    src="${base}/v/${version}/compressed/jq.js"></script>

The injected /v/[CACHEBUSTINGSTRING]/ prefix in front of (assets|compressed) has to be stripped again before the request reaches the actual asset, whose path does not include those first two elements of the URI. To do this, we use the UrlRewriteFilter. We use it instead of Apache's mod_rewrite so that developers running locally (using mvn jetty:run) don't have to configure Apache.

5. In our application, "/compressed/" is mapped to wro4j's WroFilter. In order to get UrlRewriteFilter and WroFilter to work with this setup, the WroFilter has to accept FORWARD and REQUEST dispatchers.

<filter-mapping>
    <filter-name>rewriteFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>

<filter-mapping>
    <filter-name>WebResourceOptimizer</filter-name>
    <url-pattern>/compressed/*</url-pattern>
    <dispatcher>FORWARD</dispatcher>
    <dispatcher>REQUEST</dispatcher>
</filter-mapping>

Once this was configured, we added the following rules to our urlrewrite.xml to allow rewriting of any assets or compressed resource request back to its "correct" URL.

<rule match-type="regex">
    <from>^/v/[0-9A-Za-z_.\-]+/assets/(.*)$</from>
    <to>/assets/$1</to>
</rule>
<rule match-type="regex">
    <from>^/v/[0-9A-Za-z_.\-]+/compressed/(.*)$</from>
    <to>/compressed/$1</to>
</rule>
<rule>
    <from>/compressed/**</from>
    <to>/compressed/$1</to>
</rule>
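
If you want to sanity-check the regular expression without deploying anything, it can be exercised with plain java.util.regex. The sketch below is an illustration only: the class and method names are made up, the two regex rules are merged into one pattern for brevity, and it does not involve the UrlRewriteFilter itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the two regex rules in urlrewrite.xml.
final class VersionedPathSketch {

    // Same character class as the rules above; "assets" and "compressed"
    // are captured so one pattern can serve both.
    private static final Pattern VERSIONED =
            Pattern.compile("^/v/[0-9A-Za-z_.\\-]+/(assets|compressed)/(.*)$");

    // Builds the path a page would emit (context path omitted).
    static String versioned(String version, String path) {
        return "/v/" + version + path;
    }

    // Strips "/v/<version>" again; paths that do not match are returned unchanged.
    static String rewrite(String requestPath) {
        Matcher m = VERSIONED.matcher(requestPath);
        return m.matches() ? "/" + m.group(1) + "/" + m.group(2) : requestPath;
    }

    public static void main(String[] args) {
        String url = versioned("1.0-SNAPSHOT-4567", "/assets/css/style.css");
        System.out.println(url);          // /v/1.0-SNAPSHOT-4567/assets/css/style.css
        System.out.println(rewrite(url)); // /assets/css/style.css
    }
}
```

Because "/" is not in the character class, the version segment can never swallow part of the asset path, which is what keeps the mapping back to /assets/... unambiguous.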

Of course, you can also do this in Apache. This is what it might look like in your vhost.d file:

RewriteEngine    on
RewriteLogLevel  0
RewriteLog       /srv/log/apache22/app_rewrite_log
RewriteRule      ^/v/[.A-Za-z0-9_-]+/assets/(.*) /assets/$1 [PT]
RewriteRule      ^/v/[.A-Za-z0-9_-]+/compressed/(.*) /compressed/$1 [PT]

Whether it's a good idea to implement this in Apache or using the UrlRewriteFilter is up for debate. If we're able to do this with the UrlRewriteFilter, the benefit of also doing it in Apache is questionable, especially since it duplicates the rewrite rules.

Posted in Java at Jun 04 2010, 09:27:42 AM MDT
Comments:

Nice post.

Of course, this is the only way to go when versioning is not an optional requirement, but I am curious whether setting ETag headers wouldn't solve your asset caching problem?

I'm not sure which wro4j version you are using, but the latest one (1.2.7, and even earlier ones) updates the ETag header based on the last update time of the assets. In my experience, it solves the asset caching problem for all browsers and doesn't require any extra configuration.

Alex Objelean

Posted by Alex Objelean on June 05, 2010 at 04:20 PM MDT #

It sounds like you should be using JAWR:

https://jawr.dev.java.net/

Does all of this for you; even rewrites your css to include a hash of the file in the path to resources referenced in your css. Plus it will gzip it for you and bundle up multiple css or javascript files into a single file to limit http requests while keeping your source nice and modular. And of course it handles the HTTP caching headers appropriately (though it does set the Expires header to 10 years hence, which is slightly non-compliant with the HTTP 1.1 spec, which says a resource should not expire more than a year into the future).

Posted by Robert Elliot on June 07, 2010 at 01:42 PM MDT #

Hi Matt,

Google's Optimize Caching article has more on the problem and its solution here: http://code.google.com/speed/page-speed/docs/caching.html; they refer to versioning an asset as fingerprinting.

As a result of this, I've used md5 sums of the files themselves as fingerprints to version all static files (css, js, images) of one of my projects.

The md5 sums ended up being an elegant solution as they pick up minor+major changes to files without having to extract or store version numbers anywhere else in your app.

Before fingerprinting, my app was using its framework's default strategy of query string file last modified timestamps. Using timestamps for versioning (as a query string or in the file name) can be a bad idea if you deploy to multiple servers or your version control doesn't maintain the timestamps; Git can't and Subversion needs to be configured to do so.

The md5 sum fingerprints are calculated on-the-fly once per deployment and then cached (unless you're in development or test environments).
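
For Java readers, the same idea might look roughly like the sketch below. It is not the Rails plugin's code: the class name and the "name-hash.ext" naming scheme are assumptions chosen for illustration, and a real application would hash the file's bytes once per deployment and cache the result.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical fingerprinting helper based on an MD5 of the asset's content.
final class FingerprintSketch {

    // Lower-case hex MD5 of the given bytes.
    static String md5Hex(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(content);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Assumed naming scheme: style.css becomes style-<md5>.css.
    static String fingerprinted(String fileName, byte[] content) {
        String hash = md5Hex(content);
        int dot = fileName.lastIndexOf('.');
        return dot < 0
                ? fileName + "-" + hash
                : fileName.substring(0, dot) + "-" + hash + fileName.substring(dot);
    }

    public static void main(String[] args) {
        byte[] css = "body { margin: 0; }".getBytes(StandardCharsets.UTF_8);
        // The name changes whenever the content changes, and only then.
        System.out.println(fingerprinted("style.css", css));
    }
}
```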

(This was for a Rails app but I hope the principle is useful to all app developers. The code is available as the Asset Fingerprint Rails plugin; more information can be found at http://blog.eliotsykes.com/2010/05/06/why-rails-asset-caching-is-broken/)

For Java webapps, I second Robert's above recommendation for JAWR. I used it successfully on a client site a year or so ago. If memory serves, JAWR doesn't version image files, but it is worth considering versioning image files for performance and cache-busting. Instead of md5 sums I think it uses the String#hashCode() of the file's contents as a fingerprint.

Posted by Eliot Sykes on July 10, 2010 at 01:26 AM MDT #

I take care of asset caching issues by inserting what I call a "version directory" in the path:

/myapp/assets/123/css/mystyles.css

The "123" folder above has its 'numeric' name incremented every time something changes in the assets directory.

Thus, for the next build the version-directory becomes "124".

Browsers invariably see this as a different - new - path, so caching issues are avoided.

It is a simple solution if you have a web-application that is frequently redeployed as a WAR file.

Posted by Harry Mantheakis on July 19, 2010 at 01:18 PM MDT #
