Wednesday, January 28, 2009

WASP case study: WebTrends tag audit

While doing extensive tests with WASP on a site using WebTrends, I identified a tagging issue. The site will remain unnamed to protect the innocent, but I thought it would make a pretty good case for WASP. Note that although WebTrends is shown in this example, the type of issue described here applies to almost all vendors.

In this (long) post, I will walk you through a real-world example of tag auditing with WASP. Some aspects are quite technical, but tag implementation, and even more so the quality assurance of those tags, is often neglected. Yet tag quality has a direct correlation with your ability to provide insight and business recommendations!

Even if you are not a technical person, read on, I'll hold your hand along the way :)

The WebTrends tag

Most vendors rely on the concept of page tags, a couple of lines of JavaScript embedded in each page of your site. A typical page tag for WebTrends might look like the code below, taken straight from their Tag Builder tool:
<!-- START OF SmartSource Data Collector TAG -->
<!-- Copyright (c) 1996-2009 WebTrends Inc.  All rights reserved. -->
<!-- Version: 8.6.0 -->
<!-- Tag Builder Version: 2.1.0  -->
<!-- Created: 1/28/2009 20:43:18 -->
<script src="webtrends.js" type="text/javascript"></script>
<!-- ----------------------------------------------------------------------------------- -->
<!-- Warning: The two script blocks below must remain inline. Moving them to an external -->
<!-- JavaScript include file can cause serious problems with cross-domain tracking.      -->
<!-- ----------------------------------------------------------------------------------- -->
<script type="text/javascript">
var _tag=new WebTrends();
</script>
<script type="text/javascript">
// Add custom parameters here.
</script>
<div><img alt="DCSIMG" id="DCSIMG" width="1" height="1" src=""/></div>
<!-- END OF SmartSource Data Collector TAG -->

Debugging: the traditional way

Although I receive tons of positive feedback, some skeptics argue that "debuggers" (for example, Firebug), "proxy debuggers" (for example, Charles Proxy), or even simple "web bug" checkers and parsers that look at the page source code to see whether the tag strings are there, are sufficient.

They simply can't see the value WASP is providing!

Note: I'm mentioning Firebug and Charles Proxy because they are good tools and they have their place in the web analyst's or web implementation specialist's arsenal.

The WebTrends data collection URL

With other tools you would get an indication that the WebTrends JavaScript include file is there (typically webtrends.js, as shown in the page tag excerpt above), or a debugger would show you something like this:

The challenge

Without looking at the solution below, if I told you that:
  1. The JavaScript tags are fine
  2. The tags are firing since I can see the URL flying by in a proxy debugger
  3. Data is being collected since I can view my reports online
Would you be able to identify the problem?

All elements of the solution are shown above in the page tag and the collection URL, and although I can't show you the WebTrends report, I can assure you I'm getting data in my reports.

The solution, in short

Some pages use titles containing the ampersand (&) character (yes, as seen in the obscure tag URL shown above!). The page title is automatically passed to WebTrends via the "WT.ti" parameter. The problem is that this character should be encoded/escaped. Otherwise it breaks the URL syntax WebTrends is expecting, since & is also the separator between query parameters, and it will likely result in shortened page titles in reports or even outright rejection of the data. You would miss some data, but not all, since some page titles do not use the & character! Thus, you still get reports with some data...
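To make the failure concrete, here is a minimal sketch of what happens to a title like "News & Events" when it is appended to a query string as-is. The title, the dcsuri value and the variable names are made up for illustration, and the standard URLSearchParams parser is only a stand-in: I am assuming the collector splits parameters on & the way any query-string parser does, not describing how WebTrends actually processes the hit.

```javascript
// Hypothetical page title containing an ampersand.
const pageTitle = "News & Events";

// Unescaped: the "&" inside the title ends the WT.ti value early.
const brokenQuery = "dcsuri=/news.html&WT.ti=" + pageTitle;
const brokenParams = new URLSearchParams(brokenQuery);

// Escaped: encodeURIComponent turns "&" into %26 and each space into %20.
const fixedQuery = "dcsuri=/news.html&WT.ti=" + encodeURIComponent(pageTitle);
const fixedParams = new URLSearchParams(fixedQuery);

console.log(JSON.stringify(brokenParams.get("WT.ti"))); // the title is cut at the "&"
console.log(brokenParams.has(" Events"));               // the rest became a stray parameter
console.log(JSON.stringify(fixedParams.get("WT.ti")));  // the full title survives
```

The stray " Events" parameter with no value is exactly the kind of leftover a tag audit should surface.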

WASP for Analyst sidebar view

In the WASP sidebar view shown right, we see the Title (WT.ti) tag, and we see an odd tag further down. Although it doesn't seem obvious to the untrained eye, this is a clear indication that something is wrong, since each row should be a name/value pair with the tag on the left and its value on the right.
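The "every row must be a pair" rule is easy to express in code. The sketch below is not how WASP does it; it is a hypothetical helper (findOrphanFragments is my own name, and the sample query string is invented) that splits a collection query string on & and reports any fragment that has no = sign.

```javascript
// Return the fragments of a query string that are not "name=value" pairs.
function findOrphanFragments(queryString) {
  return queryString
    .split("&")
    .filter((fragment) => fragment.length > 0)     // ignore empty pieces
    .filter((fragment) => !fragment.includes("=")); // keep only the odd ones
}

// Invented sample: the title "News & Events" was sent without escaping.
const sampleQuery = "dcsuri=/news.html&WT.ti=News & Events&WT.bh=20";

console.log(findOrphanFragments(sampleQuery));
```

A fragment without a value is not always an error for every vendor, so treat the result as a list of rows to look at, not as a verdict.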

WASP Pro crawler

The sidebar view is excellent for seeing the tags in the context of your browsing session. But for deeper analysis of site tags, using the WASP Pro crawler feature is a must. It starts from a given location, typically your home page, visits each link of your site, and gathers detailed information about the tags.

A snapshot of the built-in Data Browser is shown below. I have removed most of the information and kept only a few columns, but you can see some HTML titles and how they have been populated into the WT.ti variable. While a couple of them are fine, since the "&" character was correctly escaped in the title, some others are cut off.
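If you work with exported crawl results instead of the Data Browser, the same comparison can be scripted. This is only a sketch under assumptions: the row structure and field names (url, htmlTitle, wtTi) are hypothetical and do not reflect the actual WASP export format, and the sample rows are invented.

```javascript
// Invented rows: the HTML title of each page and the WT.ti value that was sent.
const crawlRows = [
  { url: "/about.html", htmlTitle: "About Us", wtTi: "About Us" },
  { url: "/news.html", htmlTitle: "News & Events", wtTi: "News " },
  { url: "/faq.html", htmlTitle: "Tips & Tricks", wtTi: "Tips & Tricks" },
];

// A title counts as cut off when the value sent differs from the page
// title but the page title starts with it.
function findTruncatedTitles(rows) {
  return rows.filter((row) => {
    const title = row.htmlTitle.trim();
    const sent = row.wtTi.trim();
    return sent !== title && title.startsWith(sent);
  });
}

console.log(findTruncatedTitles(crawlRows).map((row) => row.url));
```

Note that an empty WT.ti value is also reported by this check, which is usually what you want during an audit.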

The data browser is specifically built to make it easier to view all tags and to sort or filter their values. This makes it quick to peruse the crawl results. Of course, you can also export the results to a CSV file to play with the data in Excel, or even to XML if you ever need to integrate the crawl results into another system.

Excerpt from WebTrends documentation

Future releases of WASP will include data validation rules so that each value sent is checked against acceptable characters and length rules. In order to do so, I need the vendors to provide detailed information about the acceptable values for their tags, as shown in the WebTrends installation guide excerpt below:
URL Encoding
Certain characters can cause problems when used in query parameter values. For example, for a WebTrends query parameter assignment of WT.ti="The Gettysburg Address"; SDC writes the following value to the log file:

&WT.ti=The Gettysburg Address

The space characters in this value cause problems because the space character is used to separate fields within a log file. The solution is to URL encode all query parameter values. URL encoding means replacing certain characters with their hexadecimal equivalents of the form %XX where % is the escaping character and XX is the character’s numeric ASCII value. URL encoded characters are properly rendered in WebTrends reports.

Continuing with this example, the URL-encoded form is as follows:

&WT.ti=The%20Gettysburg%20Address

Note that space characters have been replaced by %20.

The tag URL encodes the following characters: tab, space, #, &, +, ?, ", \, and non-breaking spaces. These characters are defined in the regular expression list. The regular expression list contains regular expressions to search for, and the corresponding %XX replacement strings. Regular expression properties are used as arguments to the string.replace method. The tag URL encodes parameter values by passing them as arguments into the dcsEscape function.
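The mechanism described in that last paragraph (a list of regular expressions paired with %XX replacements, applied with string.replace) can be sketched as follows. This is my own illustration, not the WebTrends source: escapeRules and escapeTagValue are hypothetical names, and encoding the non-breaking space as %A0 is an assumption about the character set in use.

```javascript
// Replacement string -> pattern, for the characters listed in the excerpt.
// Assumption: the non-breaking space is encoded as %A0 (Latin-1 style).
const escapeRules = {
  "%09": /\t/g,     // tab
  "%20": / /g,      // space
  "%23": /#/g,
  "%26": /&/g,
  "%2B": /\+/g,
  "%3F": /\?/g,
  "%22": /"/g,
  "%5C": /\\/g,     // backslash
  "%A0": /\u00A0/g, // non-breaking space
};

// Apply every rule in turn, passing each pattern and its replacement
// to String.prototype.replace.
function escapeTagValue(value) {
  let escaped = String(value);
  for (const [replacement, pattern] of Object.entries(escapeRules)) {
    escaped = escaped.replace(pattern, replacement);
  }
  return escaped;
}

console.log(escapeTagValue("The Gettysburg Address"));
console.log(escapeTagValue("News & Events"));
```

Because the sketch only covers the characters named in the excerpt, a literal % in a value passes through unchanged; a general-purpose function such as encodeURIComponent is the safer choice when you control the encoding yourself.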


There is absolutely no way other methods would have helped you spot this kind of issue. I don't have to convince anyone that good analytics start with good data. You could spend countless hours, even days, trying to find a problem like this one (and many others of the same type!), while you could be spending time doing useful analysis and providing insight & recommendations to improve your business.

Go ahead, give the free version of WASP a try, or get the WASP for Analyst or WASP Pro licences. Visit now!