About bad bots, web analytics and WASP

Simple fact: bad bots can screw up your web analytics data.
Interesting post from Marshall Sponder this morning, himself referencing Jay Harper on SEOmoz and Judah Phillips. So I'll add my two cents from the "bot" side of things, specifically regarding the Web Analytics Solution Profiler (WASP).

No surprise

From a server-side perspective, IT has known since the early days of Yahoo! that crawlers would affect web server logs. If you want some fun historical tidbits, look at these early posts:
"I count the accesses to my page to see if it's being used . Similarly, I browse through the access logs to see _who_'s using the page" then someone replying "This ablity to know *exactly* what someone looks like is going to be very very sigificant down the road." (March 1995: Visitor counts?)
"Yahoo! has links to well over 40,000 different web pages, and over 300,000 people use Yahoo! every day."(August 1995: The *NEW* Yahoo!)
There is very little we can do about badly behaved bots, whether they execute JavaScript or not. But as long as the problem is not widespread (and even if it is), or you are not the deliberate target of an attack, they shouldn't render your web analytics data useless. Malicious bots will not care about robots.txt exclusion rules, slowing down your servers or screwing up your data. The only difference is that more sophisticated bots (crawlers) or session recorders/playbacks can run JavaScript and simulate real user sessions.

This is not new, it just seems marketing took a long time to find out :)

The web analyst role

Data cleansing and validation are an essential part of the web analyst's job, and whenever there's a spike in traffic you should be able to explain it. As Avinash said a while back: "data quality sucks, lets get over it".

Depending on your experience and skills, and of course the web analytics solution you are using, it might be fairly easy to identify misbehaving visitors and spot outliers. The next step is to segment the data to exclude what shouldn't be there. Not "delete", "exclude". I recently saw a post suggesting creating Google Analytics filters to completely get rid of non-US traffic for a US-centric website. Don't do that... you still want to know where your unqualified traffic is coming from! Be it traffic from outside your geographic market, bad keywords, bad referrers or anything else, what you think is "unqualified traffic" can still help you optimize your site and even discover new opportunities. Segment, segment, segment...
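To make the "exclude, don't delete" idea concrete, here is a minimal Python sketch. It is not tied to any web analytics product: the visit records, the field names and the "US only" target market are all assumptions made up for the illustration. The point is simply that every visit ends up in a segment and nothing is thrown away, so the unqualified traffic can still be analyzed.

```python
from collections import Counter

# Hypothetical visit records; the field names are assumptions for this sketch.
visits = [
    {"id": 1, "country": "US", "referrer": "google"},
    {"id": 2, "country": "CA", "referrer": "google"},
    {"id": 3, "country": "US", "referrer": "newsletter"},
    {"id": 4, "country": "FR", "referrer": "forum"},
    {"id": 5, "country": "CA", "referrer": "forum"},
]


def segment_visits(visits, target_countries=frozenset({"US"})):
    """Split visits into segments instead of deleting the unwanted ones."""
    segments = {"qualified": [], "unqualified": []}
    for visit in visits:
        key = "qualified" if visit["country"] in target_countries else "unqualified"
        segments[key].append(visit)
    return segments


def unqualified_by_country(segments):
    """Count where the unqualified traffic comes from."""
    return Counter(visit["country"] for visit in segments["unqualified"])


segments = segment_visits(visits)
print(len(segments["qualified"]), "qualified visits")
print(unqualified_by_country(segments).most_common())
```

Reporting on the "qualified" segment gives the clean view, while the "unqualified" segment stays available for spotting bad referrers or a market you didn't know you had.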

Now about WASP

On the other side of the coin, ethics and professionalism (and I guess knowledge of how the Web works, and experience too) play a big role in how a crawler behaves. For example, the crawling feature of WASP runs JavaScript and could screw up your web analytics data pretty badly. That's why the current version is limited to crawling only 100 pages. But a test on a website with 30,000 pages ran without a glitch, and WASP is expected to easily handle crawling sites of over 100,000 pages in a single run.

The upcoming version of WASP will include the following options:
  • Abide by the robots.txt rules for excluding areas of your site and reducing the load
  • Modify the user agent string to identify itself as a bot
  • Show your real IP address before crawling so you can filter that data
  • "Stealth mode", effectively blocking the web analytics request altogether.
Any other ideas?
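For readers wondering what the first two options look like in practice, here is a rough Python sketch of a well-behaved crawler check. This is not WASP's code (WASP is a Firefox extension); the user agent string, the robots.txt content and the URLs are all made up for the illustration. It only relies on the standard library's urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical user agent that clearly identifies the crawler as a bot.
USER_AGENT = "ExampleAuditBot/0.1"

# Hypothetical robots.txt content; a real crawler would download it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Crawl-delay: 2
"""


def build_parser(robots_txt):
    """Parse robots.txt rules from a string (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser, urls, user_agent=USER_AGENT):
    """Keep only the URLs the robots.txt rules allow this bot to fetch."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


parser = build_parser(ROBOTS_TXT)
candidates = [
    "http://www.example.com/products/1",
    "http://www.example.com/checkout/cart",
]
print(allowed_urls(parser, candidates))
print("Seconds to wait between requests:", parser.crawl_delay(USER_AGENT))
```

A crawler would then send USER_AGENT in the User-Agent header of every request and pause between requests, which makes it easy for the site owner to filter that traffic out of their reports.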

Performance analytics the Coradiant way

What has recently been rechristened "performance analytics" has existed for decades in IT: managing availability, capacity and performance. I've been a system administrator, DBA, developer of real-time applications for the Montreal stock exchange and involved with the Web since the first incarnation of Mosaic, so I was comfortable meeting Coradiant at their R&D center in Montreal to discuss the new TrueSight WA integration with Omniture SiteCatalyst (through the Genesis plug & play architecture).

A case for multiplicity

By now it's a given that web analytics is not only about visits and page views. The concept of "multiplicity" put forth by Avinash Kaushik is not yet tapped by most organizations. Put simply, here's how Coradiant TrueSight WA can help:
"captured user information is typically used by the IT department for troubleshooting, service level reporting and change management. But with TrueSight WA, web traffic, performance, and availability metrics gathered from each user visit are seamlessly integrated into Omniture SiteCatalyst, so marketers can get entirely new insights into campaign and customer success."
Coradiant integrates to your infrastructure to collect performance information directly, without additional impact on the client side (no JavaScript tags to change).

Correlation IS causation

Especially in the field of web analytics, how many times have we heard "correlation does not imply causation"? It's often a beginner's mistake to take two distinct metrics and come up with a misguided explanation of why one would impact the other. When we're talking about conversion rates and performance, there IS a very strong correlation between the two. Put it any way you want, I can guarantee you that non-availability will lead directly to zero conversion. I can also guarantee you that poor performance will lead to lower conversion.

So why do most web analytics solutions keep offering only non-performance related statistics? I guess it's in part caused by the great divide between IT and marketing. Now, thanks to Coradiant, that gap will be a little less cumbersome, bringing more context and a similar language around IT and marketing (at least when it comes to web analytics!).

An example

Let's make the case for performance analytics with what a typical empowered web analyst would say about a campaign (fictitious case):
"Our marketing campaign increased traffic to the site by a factor of 4 times, bringing 120,000 visits to the site on a single day instead of the usual 20,000 or so. Although the conversion rate was lower (2% instead of 4%), the campaign was a resounding success, bringing $X in increased revenues. In our next campaign we will try to increase conversion rate by doing better ad placement, but still, bringing more traffic also increased our brand awareness and engagement toward the site."
With Coradiant, here's the additional insight gained:
"The campaign was a success from a marketing perspective, but since 10% of the visits resulted in errors and about 15% were slower than usual, at the 4% conversion rate, we left $Y on the table and we probably have frustrated a fair number of our visitors. Next time I would recommend spreading out the campaign or increasing the infrastructure capacity beforehand."
Unless you were very, very, very friendly with IT, you would have never known that. Even then, it would be a lot more difficult and time consuming to come up with such a clear conclusion.
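To show how the "$Y left on the table" could be estimated, here is a back-of-the-envelope Python sketch using the figures from the fictitious case above (120,000 visits, 10% errors, 15% slow, 4% usual conversion rate). The average order value and the share of conversions lost on slow visits are pure assumptions for the illustration, as is the idea that a visit hitting an error never converts.

```python
def revenue_left_on_table(visits, error_share, slow_share, baseline_rate,
                          avg_order_value, slow_penalty):
    """Rough estimate of revenue lost to errors and slow pages.

    Assumptions of this sketch:
    - visits that hit an error never convert;
    - slow visits lose a fraction `slow_penalty` of their usual conversions.
    """
    lost_orders_errors = visits * error_share * baseline_rate
    lost_orders_slow = visits * slow_share * baseline_rate * slow_penalty
    return (lost_orders_errors + lost_orders_slow) * avg_order_value


estimate = revenue_left_on_table(
    visits=120_000,        # from the fictitious campaign
    error_share=0.10,      # 10% of the visits resulted in errors
    slow_share=0.15,       # about 15% were slower than usual
    baseline_rate=0.04,    # the usual 4% conversion rate
    avg_order_value=50.0,  # hypothetical value, not from the case
    slow_penalty=0.5,      # hypothetical: half the conversions lost when slow
)
print(f"Estimated revenue left on the table: ${estimate:,.0f}")
```

The exact figure matters less than the fact that it can only be computed once the performance data sits next to the marketing data.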

Kampyle: combining VOC and Support

From time to time I test new services. One such service is called Kampyle "feedback analytics service" and I've put it to the test on to get feedback about the site and the tool.

Similar but different

Kampyle is somewhat similar to other Voice Of Customer services such as iPerceptions or ForeseeResults: you include a snippet of code on your page and set the ratio of visitors that should get surveyed. Pretty simple.

The survey fires when the user moves the mouse out of the window space (and it makes sure you don't get bugged with it every time) and simply asks for a rating of the site and optional details. That's where it starts to be different.

It's a conversation

Kampyle moves away from other VOC services at this point. The additional categorized feedback (bug report, compliment, suggestion, etc.) really helps you learn more about the reason behind positive or negative feedback. And it works pretty well!

Another incentive to get feedback is a little triangular icon shown at the bottom of web pages, simply stating "feedback us". This way, you don't only get the survey invite, you can willingly provide your feedback whenever you want. Of course, you can look at feedback globally or for specific pages on your site.

Of the 40 pieces of feedback I got in about two weeks, about half include details that I could really act on (typos on the site, feature requests, bugs, etc.).


The management interface is nice and efficient. Grouped under Feedback Analytics, User Analytics and Feedback Inbox, the interface shows relevant stats and appropriate graphs.

  • Feedback Analytics works fine, is very simple and logical (shown above)
  • User Analytics is a bit less obvious. The goal is to get people to register to Kampyle and provide some demographics (gender, age, location, etc.). From a site owner perspective, this would be great information, but from a user perspective it's not obvious to me that people would be willing to provide such detailed information just to provide feedback. As I tell my students: always make sure to serve the primary purpose. In this case: get their feedback.
  • The Feedback Inbox is where you get a categorized view of the incoming feedback, similar to a mail inbox. The top pane shows comments grouped by type and ranking (think of tags) and the bottom pane shows the matching comments. Where I get confused is the handling of the feedback status and how the "Publish completed" works at the top level. I would expect to "reply" to a specific comment rather than "publish" a whole category (but maybe I'm missing something). Still, the fact you can get back to your client (given they left their email address, of course) is very interesting.

My take

Of course, Kampyle "eat their own dog food" and use Kampyle to get feedback about the Kampyle beta... The service is very interesting and looks promising. The User Analytics part isn't very obvious to me but could be valuable, and we also get some good technographics (browser, OS, IP location) in the Inbox. I need to get a better handle on the way this Inbox works, but this is really where Kampyle differentiates itself from other solutions. The combination of traditional Voice Of Customer surveying and similarities to a Trouble Ticket support system makes Kampyle very interesting.

[WASP] v0.45 released & important change

IMPORTANT: You need to re-install WASP from

Important change!

WASP will now be distributed from the official WASP website rather than It was taking too long to get approved and since I want to accelerate the development cycle, it will be much simpler to distribute directly from my site.

Bug fixes

  • Latest Google Analytics ga.js code
  • Fixed a problem with the "File Open" domain limitation
  • Minor cosmetic enhancement
  • Improved the crawling feature (it was actually running too fast! I had to slow it down!)
  • Couple of new tools detected (see the full list of tools detected by WASP)
  • Check the complete list of current and upcoming features


Some facts about WASP:
  • Now detects 101 different tools
  • The active user base is now close to 8,000 practitioners, consultants, web analysts and implementation specialists
  • 750,000 websites and 27,000,000 pages analyzed since January 1st
  • In a pilot run, WASP was used to scan a large web site of over 25,000 pages
  • WASP was used for market research covering 3,000 websites

Getting it or upgrading

Two ways to upgrade or get WASP:
  1. New users: Visit and click on the "Add to Firefox" green button
  2. Upgrade: If you already have WASP, YOU NEED TO REINSTALL from

Playing with my face: the influence of profile pictures

We have profile pictures everywhere: Flickr, Facebook, Google, Amazon, you name it... also in my tutor profile for the classes I'm tutoring as part of the UBC Award of Achievement in web analytics.

I thought it would be fun to show how pictures can influence behavior. Thanks to my friend Joseph Carrabis, who opened my eyes to those details, and to Gail Lipschitz, an instructor for another class I'm tutoring at UBC: Introduction to Business Process Analysis. Gail is amazingly good at "reading between the lines" of students' posts and finding out about their personality just from a profile picture!

Let's play with my face

... and tell me what you think (honestly!)

Those snapshots were taken from the forum interface in the Web Analytics for Site Optimization class.

Profile #1: formal/younger

Other than the fact I was younger (and thinner), I'm looking straight at you. The formal background might make me look a bit more professional but also could represent a top-down hierarchy: "I'm the tutor/you're the student".

Profile #2: formal/older (more weight)

This picture was taken more recently (older - notice the gray hair... and more weight!). Still very formal, still looking straight at you and zoomed closer. Might be even more intimidating because I look older (I must have more experience than you!) but at least I'm smiling...

Profile #3: casual, looking away

The third picture comes from a conference I did recently (btw, notice the title of my presentation...). No tie, more of an "in situ" picture (in fact, it's a cropped section of the larger picture at the top of this post). Do I look more "accessible" and friendly than in the previous pictures? But why am I looking away from you (or from the content)?

Profile #4: casual, looking at

The last one, which I will keep at least for this profile, is mirrored so I look "at" the content, effectively bringing your attention to it.

Cool isn't it? Do you think this simple change might have a positive influence on this forum participation and even the overall class perception? (me think so!)

Analytics tools like strawberries and oranges

There is a fascinating discussion going on in the Yahoo! web analytics forum.

I haven't seen such a heated discussion in a very long time, especially in a forum about web analytics. Aren't we passionate or what!? :)

The whole thing started with a comment that GA data could be lost or delayed and how this would impact analysis, reporting to upper management and the resulting decision making. Then the debate heated up with comments about GA being good or not, benefits of free tools vs paid, etc.

My take

Comparing Google Analytics (or other low-end solutions, some of them very good) with high-end tools is like comparing strawberries and oranges... both are still fruits and you can still get pretty good juice out of them, but it's not quite the same. Some will even mix both juices to make pretty good cocktails... In fact, around 10% of the sites use both GA and another solution.

And somehow, sometimes for unknown reasons, people are allergic to strawberries...

Are you of legal age?

If you start by looking at your web analytics maturity level, and consciously decide where you want to be on that scale, you will pretty easily find out if GA is a good fit or a high-end tool such as Omniture would be better.

I'm referring to Gartner's web analytics maturity model presented by Bill Gassman at eMetrics San Francisco last year:

Google Analytics is a level 1, most of level 2 and some of level 3 type of tool. Omniture will cover you up to most of level 4 if you use Test & Target, Discover, Genesis and imports/exports. Level 5 is serious BI that goes beyond web analytics (think Davenport's "Competing on Analytics").

GA, Coremetrics, Omniture and a bunch of others each have unique characteristics that make them better fits in some situations. The market research I'm doing clearly shows that some tools are better suited to some types of businesses and to different levels of web analytics maturity.

Google Analytics has significantly increased awareness of web analytics, which is a very good thing. The problem we now face is setting expectations for companies aiming at the higher levels of the scale with a tool that is appropriate for some of it, but not all. As a consultant, it is my role and my responsibility to recommend the best solution for each situation. In some cases, it might be more professional to let go of a client who aims for the sky but simply won't understand the differences between the tools and their limitations, because all they see is the "free" aspect of Google Analytics...

An orange juice in the morning is fine, a Pina Colada in the afternoon has a little more power!

The gift of trust from Avinash Kaushik

Going through my usual round of blog reading this morning, I noticed a new post from Avinash Kaushik. Must be good: Avinash is a well respected and trusted source. In two years he has become one of the greatest influencers (I would dare to say THE greatest) through his very high quality blog posts, his book, his speaking and now his role as the official web analytics evangelist for Google Analytics. His personality counts for a lot: cheerful, always willing to give, very attentive to his interlocutor, whether face to face or millions of pixels away.
Trust and influence is something people give you, not something you claim.

You could become an influencer, looks like I'm one now!

What happens when someone like Avinash says this:

A email Stephane wrote to me made me realize how fantastic blogs are at creating “influencers”. He described how at the eMetrics Insights Day he was invited to present industry insights on a panel along with Jupiter and Nielsen.

Pause and think about it for a second.

Two big established companies with budgets of millions and years in the “business”. And one, like me, “small” blogger. And he has the power and the authority as a result of his blog (and WASP ).

Now to be honest Stephane is brilliant and get’s invited to do this all the time. But even someone like me gets invited all the time to “analyst briefings” (sadly I decline most of them) and meeting with CEO’s and yes even gets sent nice gifts. :) Trimmings that in the past were reserved for the elite few.

For the longest time the loud voices belonged to the “experts” and “analysts”. Forrester and Jupiter and Gartner and others had a hold on the “influencing” market. They continue to have a voice, but it is no longer the voice.

Through your blog you have the power to be a “influence powerhouse”, provide an authentic voice of someone who actually knows, and provide a valuable service to the world.

The ability to influence others is now a lot more democratic. Next up on stage, Stephane, Nielsen, Forrester and You!

And that is a good thing.

I fell off my chair... When I woke up I felt good, honored, happy. Thank you Avinash!

"Where did I want to be today?"

A twist on the slogan "Where do you want to go today?". My career spans 20 years, most of them in IT, the past 15 or so dedicated to the Web. Every now and then it was "review time". Depending on the culture of the company and the quality of the boss, it was not always fun, and not always very constructive...

Some gems:
  • "Your grades are not good enough, don't even try going in IT" (a high school teacher trying to help us find our way in life...). I'm now twice on the MBA honor roll...
  • "You can't understand this, you are an IT guy". A marketing manager when I recommended changes to a site. This was the trigger that got me to do an MBA!
  • "You've got the defaults of your qualities". From a particularly clueless manager...
  • "I'll show you everything I know so you can take my job. Then I will do something even more fun!". A great manager.
  • "You failed the break-even math question. Sorry, we won't hire you". After spending a full day of interview with about 10 people...
  • "You don't have enough dedication to the company". After doing way too much unpaid overtime for a company that was shortly after purchased and cut 10-15% staff.
But I got asked a few times "where do you see yourself in 5 years" and my answer was always the same: I want to share my knowledge and be recognized as an expert in my field. Some viewed it as "inflated ego". Credit goes to a book called "Becoming a technical leader"...

I didn't want to be a boss, nor did I want to be rich. I simply wanted to be trusted and to continue growing my expertise through knowledge sharing. Thus my inclination for consulting, teaching, speaking and doing R&D.

Today I feel a lot closer to my goal.

WaW Montreal, May 14th, Cafe Melies

The last "[WAM] Web Analytics Wednesday in Montreal" goes way back to November... Spring is here, now is time for a new [WAM]!

When: May 14th, around 6:00pm, after WebCom
Where: Café Melies
RSVP: Simply send me an email
Sponsor: None, everyone pays for their own drinks

The last 6 months have been busy! Come join us for a drink and catch up on all that happened!
And hear about what is coming up!

eMetrics appreciation

Back from a fantastic trip in San Francisco for eMetrics. Catching up on emails and work. But I thought I should share my impressions while things are fresh in my mind.

Ok, I must admit I might be biased... as I presented on the Industry Insight day and was moderating one of the tracks on Tuesday.

BUT... It was an amazingly productive conference.

Way back then...

Some years ago (well... many years ago... I'm getting old!), each Internet World conference brought something new to the forefront of the industry. Remember "push technology", VRML, the early days of streaming? Without any coordination, "the industry" was moving in a direction. Although we can now laugh at some of those concepts, they nevertheless changed the face of the industry and how we use the web today.

The same pattern is happening at eMetrics. Without any preliminary coordination, there are some things setting the path for the future.

Testing & beyond

I noticed that this year eMetrics was a lot about the value of "testing": bringing in the "testing" culture and the right tools to do it in order to optimize and achieve success. Contrary to most other fields of expertise, the Web allows us to deploy quickly and continually improve. You don't want to do that with your car, your house or the space shuttle... but with the web it's how it should be. Most companies don't understand that and still impose strict project cycles; those who do understand are demonstrating huge benefits.

From IT, to marketing, to business

The other outcome, highlighted during the Industry Insight afternoon round table and in Thomas Davenport's great keynote, is the transition from web analytics to business analytics. Just like the web, web analytics started in IT, then marketing found out about it and took control. We are there now. But winning businesses understand the value of the web and have optimized some of their most important business processes around it. We are maturing to a level where we won't only talk about using web analytics for marketing optimization, but also about analytics for business process optimization and strategic level changes.

Being a tutor of both web analytics and business process analysis classes, it's obvious to me there are very strong benefits in leveraging analytics to optimize business processes.

What will be this fall's highlight? Next year's?

Industry Insight

Jim Sterne told me the experience of the Industry Insight was very positive and will be renewed this fall in Washington: leading experts sharing their view of the current and future state of the industry. I will be there to bring some hard core data about the vendor market shares and exchange with fellow analysts.

All things web analytics

Over a year ago I helped the WAA create the official Web Analytics Association Search Engine. I started the Web Analytics Conversations at the same time. Those two services are becoming more popular and I have created a dedicated page for them:

>>> Free resources for the Web Analytics Association

You will find more details about:
  • The official WAA Search Engine and how to integrate it to your own blog
  • The Web Analytics Conversations
  • The Search widget for your blog or site
  • The browser search toolbar extension
  • The iGoogle WAA Search widget

Live from eMetrics: got interviewed by Robert Scoble

eMetrics is getting close to an end and it's been great for me! Great from a learning perspective, great networking, and great opportunities for my startup.

I was walking down the hallway and bumped into Jim Sterne and Robert Scoble. Robert had just interviewed Jim, who was kind enough to mention what I'm doing with WASP. A few minutes later a short video interview was posted on Qik, Robert's "from your phone to the web" platform. Spontaneous, short, quick. I like that. Thanks Robert!

If the video above doesn't work, head over to Qik to watch my interview about WASP.

eMetrics: Davenport, Slanted Door and Lobby bar

What a day! This is my third time speaking at eMetrics since last year and it's getting better every time. The conference is growing in size and there are now numerous tracks to satisfy beginners as well as more experienced practitioners. There are also numerous "unofficial" activities, as you will see.

Tom Davenport: beyond web analytics

As I said in my previous post, I had the privilege to participate in the Industry Insights day. We concluded with a round table where we shared our opinions about the state of the web analytics industry and where we see it heading. I read Tom Davenport's "The Attention Economy" a while back and I'm halfway through "Competing on Analytics", and I already felt it was aligned with what I thought.

I loved Davenport's keynote! He is not only a great speaker, funny and full of interesting anecdotes, he should also be considered a guiding light toward what is bound to be the future of web analytics: analytics and business optimization.

Here are some random quotes from the book and from his keynote:
  • "It is not my job to have all the answers, but it is my job to ask lots of penetrating, disturbing and occasionally almost offensive questions as part of the analytic process that leads to insight and refinement". Gary Loveman
  • "Do we think or do we know?". Gary Loveman
  • "In God we trust, all others bring data". Barry Beracha, Sara Lee Bakery Group
Once I have completed my reading I will post a more extensive review of the book and my take on it. In the meantime, head over to "In God we trust, all others bring data" for a great review.

Testing, testing

Bryan Eisenberg gave, as usual, a great presentation. This time he was introducing tidbits of his upcoming book "Always Be Testing: The Complete Guide to Google Website Optimizer". This book is bound to be a category leader. I wish I had taken note of the table of contents he showed us, but from what I remember, it looks like it will be a great introduction to the concepts and methods of online testing. Bryan told me he will share a pre-release copy, so stay tuned for some early reviews!

Google Analytics v3.0: I was wrong... but...

Remember my post from a few days ago, where I speculated about Google Analytics v3.0? Ok, I was "slightly" off... But... When I asked Avinash Kaushik shortly before his presentation he said something like "You will be disappointed... but I shared your idea with the team. I told them Stéphane wants this, so we need to do it" in his always amusing and friendly tone. Avinash, you are great! :)

I'm supposed to get enrolled in the Google Analytics for Blogger beta program. Stay tuned for more info.

Slanted Door

Once the tracks and sessions are over, the "unconference" can start. Dinner often ends up being a unique occasion to network and share on all kinds of topics related to web analytics (or not!). Sunday night was an intimate dinner with my friends Joseph Carrabis, René Dechamp Otamendi and eMetrics event coordinator Matthew Finlay.

Last night Ian Thomas and his team invited a couple of us to The Slanted Door, a great Asian fusion restaurant. Along with Carrabis, Dechamp, Finlay and others, Jim Sterne and Bryan Eisenberg contributed to a great dinner and great fun!

Of course, as the tradition goes, we ended up at the lobby bar and beyond... A good scotch and the traditional Belgian chocolate from René summed it up for the night.

Time to run...

Today I'm moderating the Marketing Optimization Management track. I also noticed there's a lot of interest today for optimization and multivariate testing with keynotes from Omniture, Optimost and Interwoven.

Time to go! Stay tuned for more insights from eMetrics!

eMetrics San Francisco: Industry Insight

I thought of titling this post "Continental airlines: redux" to reference my horror story from a few weeks ago. Here's another one (yeah... I should have known better!), or skip that part and jump to the Industry Insights.

Continental airlines: strike two

It was supposed to be a smooth traveling day: leaving Quebec City for Detroit, then San Francisco. Turned out this time the plane was 4 hours late because of bad weather in the central US.

The plane from Detroit to SF would be at 7:00pm, basically wasting my whole day... So I asked to be rerouted through New York to take a plane around 12:30, another half hour later than planned. I knew the connection would be tight in New York, where I had to go through customs. Of course, I got a pretty stiff customs guy who was very friendly... too friendly... making jokes and wasting even more of the little time I had. Then it was security after picking up my bags, another check to put the bags back on the next plane... one last security check...

I ran for nothing... the plane was late another 30 minutes. Finally got on the plane... taxied for several minutes and waited on the tarmac even more. Then we saw the maintenance trucks come in... bad news. "Mechanics told us they now know what is the problem and it will take 20 minutes to fix".

20 minutes turned into 2 hours, with us going back to the gate and getting off the plane. Then I realized I was at exactly the same gate where I got stuck the last time! Could it be Groundhog Day? Or maybe one of those hidden camera pranks?

I finally got to the hotel around midnight... roughly 12 hours later than I was supposed to.

Industry Insights

There were two pre-conference events on Sunday: WAA basecamp and Industry Insight. I was very glad to present some of the web analytics vendor market share insights I gained through WASP data. The morning was spent filling our brains with lots of data and putting us in the right mindset for the afternoon: what do we want our industry to be? Where is it headed? I found the discussions at the Industry Insight event amazingly interesting: very experienced analysts, great subjects and challenging ideas.

I don't want to go into too many details about what was discussed and what were the outcomes as this will be presented here on Tuesday under the very appropriate title of "Insights from Industry Insights Day". Stay tuned!

Davenport: Competing on Analytics

I read Thomas Davenport's book "The Attention Economy" a while back and it changed the way I think. I had wanted to read "Competing on Analytics" for a while, and with all the time I had at the airport and on my way in, I'm already halfway through. Thomas Davenport is Monday morning's keynote and I'm sure it will reinforce my opinion that web analytics as we know it today, very marketing-centric, is going to give way to business analytics that will drive strategy and process optimization way beyond the limits of the web.