Mixing Apache HTTP Server, mod_rewrite and mod_jk

I'm trying to configure Apache and Tomcat to work with a desired architecture for doing A/B Testing on my current project. Our basic idea is that we'll deploy entirely new WAR files when we have a test, then use the magic of Apache's mod_rewrite, mod_jk and possibly the UrlRewriteFilter to keep the URLs somewhat consistent between version A and version B. Here are some questions I have for those folks who might've done this before:

  1. Is it possible to use Apache's mod_rewrite to map http://www.domain.com/?v=1 to http://www.domain.com/1? (This allows us to have two different applications/WARs served up to the same domain.)
  2. If #1 is possible, what's the RewriteRule for allowing the parameter to be anywhere in the query string, but still allowing the target to use it as the context name?
  3. Is it possible to use something in the WAR (likely the UrlRewriteFilter) to produce HTML that has rewritten links (i.e. http://www.domain.com/?id=1 in the WAR whose context is 1)?

In other words, can Apache forward to the correct app going in, and can that app rewrite its URLs so those same URLs are used when going out?

I believe this is all possible. However, I am having difficulty getting mod_jk to allow mod_rewrite to be processed first. If I have the following in httpd.conf, it seems like htdocs/.htaccess gets bypassed.

JkMount /* loadbalancer

Is it possible to configure Apache/mod_jk so WARs can hang off the root, but still use mod_rewrite? If not, the only solution I can think of is to use UrlRewriteFilter in the WAR to forward to another context when a "v" parameter is in the URL. Currently, the UrlRewriteFilter doesn't allow forwarding to another context. The good news is the Servlet API allows it. I got it working in Tomcat (with crossContext enabled) and wrote a patch for the UrlRewriteFilter.

Anyone out there have experience doing A/B Testing in a Java webapp? If so, did you try to disguise the URLs for the different versions?

Update: I've got a bit of this working. The magic formula seems to be: don't try to hang things off the root; use mod_rewrite to make things appear to hang off the root.

First of all, I posted a message similar to this post to the tomcat-user mailing list. Before I did so, I discovered mod_proxy_ajp, which happens to look like the successor to mod_jk. AFAICT, it doesn't allow fine-grained rules (i.e. only serve up *.jsp and *.do from Tomcat), so I'll stick with mod_jk for now.

Rather than proxying all root-level requests to Tomcat, I changed my JkMount to expect all Tomcat applications to have a common prefix. For example, "app".

JkMount /app* loadbalancer

This allows me to create RewriteRules in htdocs/.htaccess to detect the "v" parameter and forward to Tomcat.

RewriteEngine On

RewriteCond     %{QUERY_STRING}     ^v=(.*)$
RewriteRule     ^(.*)$              /app%1/ [L]

This isn't very robust: any additional parameter in the query string causes the forward to fail. However, it does successfully forward http://localhost/?v=1 to /app1 on Tomcat and http://localhost/?v=2 to /app2 on Tomcat.
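
To see why extra parameters break the rule, here's a minimal sketch that replays the RewriteCond expression with java.util.regex. It assumes Java's regex engine behaves like mod_rewrite's for an expression this simple; the class and method names are made up for illustration and are not part of any Apache or Tomcat API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchoredQueryMatch {
    // Same expression as the RewriteCond above; %1 in the rule corresponds to group 1 here.
    static final Pattern V_ONLY = Pattern.compile("^v=(.*)$");

    /** Returns the path the rule would rewrite to, or null if the condition does not match. */
    static String target(String queryString) {
        Matcher m = V_ONLY.matcher(queryString);
        return m.find() ? "/app" + m.group(1) + "/" : null;
    }

    public static void main(String[] args) {
        // Hypothetical query strings: "v" alone, "v" first, "v" last.
        String[] samples = {"v=1", "v=1&lang=en", "lang=en&v=1"};
        for (String qs : samples) {
            System.out.println(qs + " -> " + target(qs));
        }
    }
}
```

With "v" first, the greedy (.*) swallows the rest of the query string, so the target becomes /app1&lang=en/. With "v" anywhere else, the ^ anchor stops the condition from matching at all.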

What about when ?v=3 is passed in? There's no /app3 installed on Tomcat, so Tomcat's ROOT application will be hit. Using the UrlRewriteFilter, I installed a root application (which we'll likely need anyway) with the following rule:

    <rule>
        <from>^/app(.*)$</from>
        <to type="forward">/</to>
    </rule>

So I've solved problem #1: Using URL parameters to serve up different web applications. To solve the second issue (webapps should rewrite their URLs to delete their context path), I found two solutions:

  1. Use mod_proxy_html. Sounds reasonable, but requires the use of mod_proxy.
  2. Use the UrlRewriteFilter and outbound-rules.

Since I'm using mod_jk, #2 is the reasonable choice. I added the following link in my /app1/index.jsp:

<a href="<c:url value="/products.jsp"/>">link to products</a>

By default, this gets written out as http://localhost/app1/products.jsp. To change it to http://localhost/products.jsp?v=1, I added the following to urlrewrite.xml:

    <outbound-rule>
        <from>^/app1/(.*)$</from>
        <to>/$1?v=1</to>
    </outbound-rule>

This produces the desired effect, except that when I click on the link, a new session is created every time. AFAICT, I probably need to do something with cookies so the jsessionid cookie is set for the proper path.
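
A likely explanation, assuming Tomcat's default of scoping the JSESSIONID cookie to the context path (/app1 here): the browser never sends that cookie back for /products.jsp?v=1, so every click looks like a new visitor. The sketch below is a simplified version of the cookie path-matching rule described in RFC 6265; the class and method names are hypothetical.

```java
class CookiePathCheck {
    /**
     * Simplified cookie path-match: the cookie is sent when the request path
     * equals the cookie path, or starts with it at a "/" boundary.
     */
    static boolean pathMatches(String requestPath, String cookiePath) {
        if (requestPath.equals(cookiePath)) {
            return true;
        }
        if (!requestPath.startsWith(cookiePath)) {
            return false;
        }
        // A prefix only counts if it ends at a path separator.
        return cookiePath.endsWith("/") || requestPath.charAt(cookiePath.length()) == '/';
    }

    public static void main(String[] args) {
        // Link as written by the webapp: the cookie scoped to /app1 is sent.
        System.out.println(pathMatches("/app1/products.jsp", "/app1"));
        // Link after the outbound-rule: same cookie, no longer sent.
        System.out.println(pathMatches("/products.jsp", "/app1"));
        // A cookie scoped to "/" would be sent for the rewritten link.
        System.out.println(pathMatches("/products.jsp", "/"));
    }
}
```

If that is the cause, the session cookie needs a path of / to survive the rewritten URLs. Tomcat 5.5 and 6 connectors have an emptySessionPath attribute that sets the cookie path to /, though whether it suits this setup is untested here.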

Not bad for a day's work. Only 2 questions remain:

  1. What's a more robust RewriteRule that doesn't care about other parameters being passed in?
  2. What do I need to do so new sessions aren't created when using an outbound-rule?

It's entirely possible that mod_proxy_ajp with mod_proxy_html is the best tool for this. Can mod_proxy handle wildcards like JkMount can? I've heard it's faster than mod_jk, so it probably warrants further investigation.

Update 2: I achieved the desired result using mod_rewrite, mod_jk and the UrlRewriteFilter (for outgoing links). Here's what I put in htdocs/.htaccess (on Apache):

RewriteEngine On

# http://domain/?v=1 --> http://domain/app1/?v=1
RewriteCond     %{QUERY_STRING}     v=([^&]+)
RewriteRule     ^(.*)$              /app%1/$1 [L]

# http://domain --> http://domain/app (default ROOT in Tomcat)
RewriteRule     ^$                  /app/ [L]
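
One caveat with the unanchored v=([^&]+) expression: it also matches any parameter whose name merely ends in "v" (rev=7, for example). The sketch below compares it with a stricter variant that requires "v" to be a whole parameter name. The stricter expression is a suggestion, not part of the configuration above, and the class and method names are hypothetical; it again assumes Java's regex engine behaves like mod_rewrite's for these expressions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VersionParamMatch {
    // Expression from the RewriteCond above: finds "v=" anywhere in the query string.
    static final Pattern LOOSE = Pattern.compile("v=([^&]+)");
    // Assumed stricter variant: "v" must start the query string or follow an "&".
    static final Pattern STRICT = Pattern.compile("(?:^|&)v=([^&]+)");

    /** Returns the captured version (what %1 would hold), or null if there is no match. */
    static String version(Pattern pattern, String queryString) {
        Matcher m = pattern.matcher(queryString);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical query strings for illustration.
        String[] samples = {"v=1", "id=5&v=2&x=y", "rev=7"};
        for (String qs : samples) {
            System.out.println(qs + " loose=" + version(LOOSE, qs)
                    + " strict=" + version(STRICT, qs));
        }
    }
}
```

Both expressions pick out the version regardless of where "v" sits in the query string; only the loose one treats rev=7 as version 7.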

And in the urlrewrite.xml of each webapp:

   <outbound-rule>
       <from>^/app([0-9])/([A-Za-z0-9]+)\.([A-Za-z0-9]+)$</from>
       <to>/$2.$3?v=$1</to>
   </outbound-rule>

   <outbound-rule>
       <from>^/app([0-9])/([A-Za-z0-9]+)\.([A-Za-z0-9]+)\?(.*)$</from>
       <to>/$2.$3?$4&amp;v=$1</to>
   </outbound-rule>
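
To check what these two outbound-rules do to a link, here's a rough Java approximation. It assumes the filter applies each from/to pair much like a java.util.regex replacement (with &amp; in the XML decoding to a plain &), which glosses over the filter's real rule processing; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OutboundRuleDemo {
    // <from> expressions of the two outbound-rules above.
    static final Pattern NO_QUERY =
            Pattern.compile("^/app([0-9])/([A-Za-z0-9]+)\\.([A-Za-z0-9]+)$");
    static final Pattern WITH_QUERY =
            Pattern.compile("^/app([0-9])/([A-Za-z0-9]+)\\.([A-Za-z0-9]+)\\?(.*)$");

    /** Applies the first matching rule; links matching neither rule are left untouched. */
    static String rewrite(String url) {
        Matcher plain = NO_QUERY.matcher(url);
        if (plain.matches()) {
            return plain.replaceFirst("/$2.$3?v=$1");
        }
        Matcher withQuery = WITH_QUERY.matcher(url);
        if (withQuery.matches()) {
            return withQuery.replaceFirst("/$2.$3?$4&v=$1");
        }
        return url;
    }

    public static void main(String[] args) {
        System.out.println(rewrite("/app1/products.jsp"));      // /products.jsp?v=1
        System.out.println(rewrite("/app2/products.jsp?id=5")); // /products.jsp?id=5&v=2
        System.out.println(rewrite("/app1/img/logo.png"));      // unchanged
    }
}
```

The last case shows a limit of these expressions: [A-Za-z0-9]+ does not match a slash or a hyphen, so links into subdirectories keep their /appN prefix, and ([0-9]) only covers single-digit versions.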

Next I'll try to see if I can get it all working with mod_proxy_ajp and mod_proxy_html. Anyone know the equivalent of "JkMount /app*" when using LocationMatch with mod_proxy?

Posted in Java at Apr 16 2007, 12:07:10 PM MDT 11 Comments
Comments:

I used to do something very similar, but without using mod_rewrite, only mod_jk2. I can't remember exactly how I did it but looking at my old workers2.properties there's some evidence of using multiple workers:
[channel.socket:localhost:8009]
port=8009
host=localhost

[channel.socket:localhost:8010]
port=8010
host=localhost

[ajp13:localhost:8009]
channel=channel.socket:localhost:8009

[ajp13:localhost:8010]
channel=channel.socket:localhost:8010

[uri:www.domain.com/1/*]
worker=ajp13:localhost:8009

[uri:www.domain.com/2/*]
worker=ajp13:localhost:8010
I'm assuming I then configured Tomcat to use multiple JK2 AJP 1.3 Connectors. Either that or I configured multiple Tomcat instances, each one with a connector listening on a different port.

Of course then you'll also need in your Tomcat config:
<Context path="/1" docBase="app1" debug="0">
...
</Context>

<Context path="/2" docBase="app2" debug="0">
...
</Context>
It's probably an obsolete solution by now, and a rather ugly one, but nothing's as ugly as getting into mod_rewrite ;)

Posted by Daniel Farinha on April 17, 2007 at 04:22 AM MDT #

OMG, how much work and trouble :( !

Wouldn't everything be much simpler if Apache were removed entirely from this equation? Tomcat alone seems to do its job very well (especially v6).

Ahmed.

Posted by Ahmed Mohombe on April 17, 2007 at 04:39 AM MDT #

Ahmed - Tomcat does nothing to solve the problems I'm looking to solve. It doesn't do load balancing, and it also doesn't allow two applications to be served up at the same URL. Sure, I can use UrlRewriteFilter to serve up apps at the same URL, but I'd need to handle the logic within my application. This means that if I wanted to install a new WAR for A/B Testing, I'd have to bring down the ROOT webapp to do it. Of course, if I had Apache doing the front-end load-balancing, I wouldn't need to worry about any interruptions. ;-)

Posted by Matt Raible on April 17, 2007 at 09:46 AM MDT #

Matt, I'm interested in the topic, but I did not fully understand what problem you are trying to solve. I understand that A/B testing is trying different options on the same website in order to improve clicks, but I do not understand how you can achieve it. Can you explain how deploying two web applications with similar links under the same domain relates to A/B testing? Thanks, Giovanni

Posted by Giovanni on April 18, 2007 at 02:58 AM MDT #

Giovanni - what I'm trying to do is deploy completely separate WARs (the A and B tests) and serve them up without changing the context path. In other words, it should appear they're both the same application. For example:

  • http://my.domain.com/ --> WAR A (the default, no testing here)
  • http://mydomain.com/?v=1 --> WAR B
  • http://mydomain.com/?v=2 --> WAR C
  • http://mydomain.com/?v=3 --> WAR A (because there is no WAR 3 deployed, and we need to deactivate tests)
For each application, rather than having them produce outbound links with my.domain.com/context1/foo.html, they produce my.domain.com/foo.html?v=1. This makes it (somewhat) transparent to the end user. In addition to parameter-based rewriting, I've also got cookie-based rewriting working. The last thing I need to do is figure out how to route to 1, 2 and 3 when no cookie is present.

Last night, I spent about an hour trying to get mod_proxy to support wildcards (i.e. "/app([0-9])"), but had no luck. I'm sticking with mod_jk for now since it works "good enough."

Posted by Matt Raible on April 18, 2007 at 09:40 AM MDT #

Thanks Matt, now I have a better understanding of the issue. I think I'm still missing how you conduct the A/B test. Do you enable the three contexts at the same time and load balance between them? If so, how do you evaluate the testing results?

Posted by Giovanni on April 19, 2007 at 01:39 AM MDT #

> Do you enable the three contexts at the same time and load balance between them?
Yep, this is the general idea. I haven't figured how to do the load balancing yet, but hopefully it won't be too hard. Most load balancing stuff I've seen balances between servers, not applications. I've gotten some ideas from the Apache mailing list so I'll be trying those out today.

> If so how do you evaluate the testing results?

Using a different tracking number for each app. The analytics tracking tool should show if one app is achieving better results than the others.

Posted by Matt Raible on April 19, 2007 at 08:36 AM MDT #

To be honest, I don't fully understand what you're trying to do, but I got here looking for a way to mix mod_proxy and mod_rewrite, so maybe this will help someone else with the same problem:

In short, the trick is adding the [P] flag to the rewrite rule. This allows the rewritten URL to go to mod_proxy.


Specifically, what I needed was for all URLs with '.servlet' in them to go to Tomcat, and leave the rest as static data on the httpd. This worked:

RewriteEngine On
RewriteRule	^/(\w+\.servlet.*)$ /myapp/$1 [P]

ProxyPass         /myapp  http://localhost:8081/
ProxyPassReverse  /myapp  http://localhost:8081/
(You need to configure a connector in Tomcat as well, as explained here.)

Posted by itsadok on May 13, 2007 at 02:11 AM MDT #

OK, it's even simpler than I thought. mod_rewrite does the proxying itself. So all you need is:
RewriteEngine On
RewriteRule	^/(\w+\.servlet.*)$ http://localhost:8081/$1 [P]

Posted by itsadok on May 17, 2007 at 05:01 AM MDT #

Hi Matt, I'm trying to combine mod_jk with URL rewriting at the moment. The situation: I want the context path out of the URL, so that the user goes to http://foohost.net/ instead of http://foohost.net/roller/xyz/. I tried a JkMount /roller/* ajpwhatever and then a URL rewrite from / to /roller/xyz/$1, but as soon as the URL has been rewritten, mod_jk no longer kicks in. Apache tries to serve content from /path_to_htdocs/roller/xyz/, which fails with a 404. Do you know the problem, and maybe a potential solution? Cheers, Robert

Posted by Robert on July 29, 2007 at 12:35 PM MDT #

Hi! I have a problem with UrlRewriteFilter: when I use it, it generates new sessions whenever I use rewritten URLs, even if I don't use outbound rules (if I put the URL in the browser, I also get a new session for that link). Do you have a solution for this? :O

Posted by Cristian on October 08, 2007 at 02:49 PM MDT #
