Recently I’ve been working with one of our business units on content tracking. They’ve been trying to track how our site is used and how popular certain features are. They had started rolling out an appliance that literally sniffed the traffic and tracked the results. That works up to a point, but it still leaves a lot of hard work in figuring out how users are actually using the system. This is where Google Urchin comes in…
A few weeks before all this came up, I was reading DevCentral, F5’s community-driven site, and stumbled across a post called “Automated Gomez Performance Monitoring”. It sat in my brain as an idea I’d like to try out, perhaps by deploying Google Analytics on the production site for some testing. It wasn’t long before it was needed.
So this is what we ended up with…
So what is this? And what is it doing? This is an iRule, written in the cut-down Tcl rule processing language used on the F5 load balancers. There are three triggers, or events, in play here. The first fires when an HTTP request is made (the initial client request), the second when the server sends back its HTTP response, and the last when the stream filter set up in the second finds a match.

The important stuff is in the HTTP_RESPONSE section. Because we only want the rule to apply to successful HTML pages, we check for that first, and then set a variable with the Urchin code we need to use. Now for the important bit, the STREAM::expression code. This is basically a search-and-replace expression. In my case, I’m looking for the closing </body> tag, as this appears at the end of the page, and replacing it with the Urchin code followed by a new </body> tag. The STREAM_MATCHED code kicks in when the processor gets a match, and disables the stream engine. This is so that we only do one replacement, just in case there are multiple </body> tags in the content.
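The rule described above can be sketched roughly as follows. This is a minimal reconstruction from the description, not the exact rule we deployed: the tracking snippet and its script path are placeholders, the text/html Content-Type check is my assumption for how "HTML pages" are detected, and disabling the stream filter in HTTP_REQUEST is an assumption about what the first event does.

```tcl
when HTTP_REQUEST {
    # Assumption: keep the stream filter switched off until a response qualifies.
    STREAM::disable
}

when HTTP_RESPONSE {
    # Only rewrite successful HTML pages (Content-Type check is an assumption).
    if { [HTTP::status] == 200 && [HTTP::header value "Content-Type"] starts_with "text/html" } {
        # Placeholder snippet: substitute the tracking code for your own Urchin
        # setup. It must not contain the "@" delimiter used below.
        set urchin_code {<script type="text/javascript" src="/placeholder-urchin.js"></script>}

        # Search for the closing body tag and replace it with the tracking
        # code followed by a fresh closing body tag.
        STREAM::expression "@</body>@${urchin_code}</body>@"
        STREAM::enable
    }
}

when STREAM_MATCHED {
    # Stop after the first match so only one replacement is made.
    STREAM::disable
}
```

The "@" characters are just delimiters separating the search string from the replacement, so any snippet that contains one would need a different delimiter.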
This is all great, but there are some caveats. The stream searches will not work on compressed content. It looks like the author of the Gomez injection rules ran into this as well: in the last installment of the series, he explicitly removes the Accept-Encoding header from the request, which is the header that tells the server the client supports compression. This seems to affect data going back through the load balancer as well, and stops the F5 from compressing the content using profiles. We handled this by disabling compression support on the servers (in our case IIS).
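For comparison, the header-stripping approach mentioned above might look something like the sketch below. This is not what we deployed (we turned compression off in IIS instead), and it is only my reading of the technique rather than a copy of the Gomez rule.

```tcl
when HTTP_REQUEST {
    # Remove the header that advertises compression support, so the server
    # replies with uncompressed content that the stream filter can search.
    HTTP::header remove "Accept-Encoding"
}
```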
The last caveat I can remember is that you must have a stream profile enabled on your virtual server. You won’t be able to apply this iRule to your virtual server without it, even if it is just the generic stream profile.
So in the end, we created a new HTTP profile which did content compression. This was required as we’d removed compression from the server side. The new HTTP profile was assigned on the F5, along with a stream profile and the iRule. Data quickly started flowing to the Urchin server once we finished the rollout.
Now the business unit is happy, as we turned what could potentially have been a six-week project into a 15-minute fix, calling into play some of the more powerful parts of the F5 load balancers which we had yet to use in this part of the application.
It’s worth reading the entire DevCentral Gomez injection series. Joe Pruitt does an excellent job of explaining the rules and how they work, then expanding upon the basic project to track more detailed information.