Tuesday, 29 March, 2011

Web analytics ethics: from theory to practice

A week ago I published three short cases and invited readers to say whether each one was legal, ethical and compliant with their web analytics vendor's Terms of Service (TOS). The cases were inspired by my own experience. After so much talk about the WAA Code of Ethics, the sessions at the recent eMetrics and the discussions I had with some vendors, I thought participation would be much higher.

Here’s my point of view and some info from the brave souls who were up for the task! You should really read the previous post before continuing!


Photo: stock.xchng
Disclaimer: I'm neither a lawyer nor an ethics specialist - this information is provided as is... do your homework!

The majority of the 14 respondents were from the US and UK with some participants from Canada and other European countries. Unsurprisingly, most respondents said they were using Google Analytics.

Case #1: matching transaction id against back-end.

                                  Unsure    No   Yes
It is legal                          20%    0%   80%
It is acceptable based on my TOS     27%   20%   53%
It is ethical                        13%   13%   73%

In my opinion, this is perfectly legal – the data was collected with user consent in the context of a commercial relationship. It is also ethical – it is common and accepted to send a “thank you” email, along with the purchase details and some offers. The fact that it is sent through traditional snail mail doesn’t matter – or does it? Since the transaction was done online, there is usually an expectation that communications will also be conducted online. As one of the respondents put it, “At the end of the day, 'ethical' depends more on your relationship with your customer than anything else”. As for the TOS, every serious tool vendor specifically prohibits sending Personally Identifiable Information (PII) to its system.

A transaction id, which is clearly not PII, is typically set by your back-end system and stored in your web analytics service of choice. Since this piece of data comes from your own system and is merely merged back against it, there is generally no TOS issue – except with the Google Analytics TOS! (emphasis mine)
7. PRIVACY . You will not (and will not allow any third party to) use the Service to track or collect personally identifiable information of Internet users, nor will You (or will You allow any third party to) associate any data gathered from Your website(s) (or such third parties' website(s)) with any personally identifying information from any source as part of Your use (or such third parties' use) of the Service. You will have and abide by an appropriate privacy policy and will comply with all applicable laws relating to the collection of information from visitors to Your websites. You must post a privacy policy and that policy must provide notice of your use of a cookie that collects anonymous traffic data.
Repeat: "You will not associate any data gathered from your website(s) with any personally identifiable information from any source as part of your use of Google Analytics". Essentially, if you use Google Analytics, you should not extract transaction ids to merge them back against your own system. This makes no sense to me, and I know of several organizations that are actually doing it – probably without realizing they are breaking their GA TOS. Let’s hope this will be revised.
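
To make the mechanics of Case #1 concrete, here is a minimal Python sketch of the merge in question. Everything in it is hypothetical: the field names, the sample rows and the `merge_on_transaction_id` helper are assumptions for illustration, not the export format of any particular vendor.

```python
# Hypothetical rows exported from a web analytics tool: no PII, only the
# transaction id set by the back-end plus some behavioural data.
analytics_rows = [
    {"transaction_id": "T1001", "source": "newsletter", "pages_viewed": 7},
    {"transaction_id": "T1002", "source": "organic", "pages_viewed": 3},
    {"transaction_id": "T9999", "source": "paid", "pages_viewed": 5},
]

# Hypothetical back-end orders, keyed by transaction id. This side holds the PII.
backend_orders = {
    "T1001": {"customer_name": "A. Example", "postal_address": "1 Sample St"},
    "T1002": {"customer_name": "B. Example", "postal_address": "2 Sample St"},
}


def merge_on_transaction_id(rows, orders):
    """Attach back-end order details to each analytics row with a known id."""
    merged = []
    for row in rows:
        order = orders.get(row["transaction_id"])
        if order is None:
            continue  # id unknown to the back-end: nothing to merge
        merged.append({**row, **order})
    return merged


merged = merge_on_transaction_id(analytics_rows, backend_orders)
for record in merged:
    print(record["transaction_id"], record["source"], record["customer_name"])
```

The join itself is trivial. What the clause quoted above speaks to is the result: behavioural data sitting next to a name and an address.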

Case #2: matching product id (SKU) against back-end

                                  Unsure    No   Yes
It is legal                           7%    0%   93%
It is acceptable based on my TOS     27%    0%   73%
It is ethical                         7%    0%   93%

Legal, ethical and no TOS issue. The key element here is that no PII is involved. From a business standpoint, what’s interesting is the ability to use behavioural data to correlate with sales in order to build a predictive model where we “know” which online behaviours are early indicators of upcoming sales and therefore, adjust inventories accordingly.
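
As a rough illustration of the "early indicator" idea, the sketch below checks whether product page views in one week move together with sales of the same SKU the following week. The weekly figures, the one-week lag and the `pearson` helper are all assumptions made up for the example. A plain correlation is only a first look, not a predictive model.

```python
from math import sqrt

# Hypothetical weekly figures for a single SKU (no PII involved).
weekly_page_views = [120, 150, 90, 200, 170, 130]  # from web analytics
weekly_units_sold = [10, 13, 15, 8, 21, 17]        # from the back-end


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Pair views in week t with sales in week t + 1 (assumed one-week lag).
lag = 1
views = weekly_page_views[:-lag]
sales_next_week = weekly_units_sold[lag:]

r = pearson(views, sales_next_week)
print(f"Correlation between views and next-week sales: {r:.2f}")
```

A strong correlation on real data would only suggest that the behaviour is worth feeding into an inventory forecast; it would still need validating against periods the model has not seen.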

Case #3: key created from (potential) PII without user consent

                                  Unsure    No   Yes
It is legal                          33%   20%   47%
It is acceptable based on my TOS     53%   40%    7%
It is ethical                        33%   33%   33%

If I got it right, in the US a last name alone, a 5-digit zip code or the last digits of a phone number are not considered PII.

However, in California, OPPA specifies that data which is typically not PII becomes PII when combined with other data (such as gender associated with a specific person). In Canada, the PIPEDA law stipulates that data must be collected with user consent and used only for the purpose it was collected for. In Europe, and especially Germany, a last name is PII (so are IP addresses and a whole bunch of other things!).

Is it ethical? In this specific case, the data is stored even if the transaction isn’t fully completed. Therefore, this practice goes against the 3rd WAA Code of Ethics guideline: User Control. It is also against PIPEDA in Canada.

What about the TOS? In general, this wouldn’t be an issue, and it doesn’t really matter whether the string is further encoded to obfuscate it. However, the Google Analytics TOS still doesn’t allow us to use this key to merge with any other data that could contain PII.
In airports, the stand by list typically shows first three letters of last name and first letter of first name
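
Why further encoding changes little can be shown with a short Python sketch. The recipe used here (first three letters of the last name, zip code, last four digits of the phone number, hashed with SHA-256) is an assumption for illustration, not the exact key from the case, and `make_key` is a hypothetical helper.

```python
import hashlib


def make_key(last_name, zip_code, phone):
    """Build an obfuscated key from (potential) PII fragments.

    Assumed recipe: first 3 letters of the last name, the zip code and
    the last 4 digits of the phone number, hashed with SHA-256.
    """
    digits = "".join(ch for ch in phone if ch.isdigit())
    raw = f"{last_name[:3].upper()}|{zip_code}|{digits[-4:]}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


# Key as it would be computed on the website and sent to the analytics tool.
key_in_analytics = make_key("Example", "90210", "555-010-1234")

# Same recipe applied later to a (hypothetical) customer database record.
key_in_crm = make_key("example", "90210", "(555) 010 1234")

# The hash hides the fragments from a casual reader, but anyone holding the
# source data and the recipe can recompute the key and join on it.
print(key_in_analytics == key_in_crm)
```

Because the key is deterministic, hashing hides the fragments but still lets the two data sets be merged, which is exactly what the GA clause is about.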

My take

While there are passionate arguments on "free vs paid" in the #measure tweet universe, I was sincerely disappointed that topics like ethics and legality didn’t attract more participation. Is it because of a lack of interest? Fear of being wrong?

Either way, it makes me wonder if web analysts happily embrace the WAA Code of Ethics because it feels good and it's a worthy cause... or are just full of it! I guess what’s most important for now isn’t to know all there is to know about ethics, legislation and TOS, but to take action when inappropriate situations are uncovered.

I don't pretend to know more than anyone else; in fact, I'm willing to be wrong! If you have comments or additional useful references, I would love to hear from you!

9 comments:

It's interesting that there seems to be a correlation between legality and ethics in the minds of your respondents. To me, the Code of Ethics is there as a flag against practices that are deemed unethical by the community, rather than deemed unethical by law.

Great post, Stephane, and the fact that you're as concerned with the lack of response to the survey as you are with the actual responses is well-placed.

A co-worker yesterday showed me a picture of the "Your Data for Sale" cover from the 3/21 issue of Time. My reaction was a little embarrassing: "Yeah. Privacy is a hot topic, and it's damn complicated -- in the U.S., you've got Congress and the FTC, and in Europe you have the EU (hmmm...what action is going on in Canada? I don't know). It's really, really messy." What I found embarrassing was that this was coming on top of me having taken your survey and feeling a lot like I was guessing at my answers (although I seemed to wind up pretty close to where you landed).

The challenge, it seems, is how to break things down beyond a feel-good, read-once-and-signed Code of Ethics to something more directly applicable -- a decision tree, maybe, that could be evolved over time as the varying standards and regulations crystallize? That's a tall order. And, with analysts already being challenged to be "center brained" and speak both the language of data *and* the language of business (marketing, primarily), adding on a layer of the language of legal/regulatory... Ouch.

The whole "tie order ID to GA data" question is very tricky. At GAUGE, Phil Mui (Google Analytics Group Product Manager) alluded to the fact that this section specifically MAY be revisited. He did not say which direction they would be going with it, but they know that it is a very grey area and are going to do something to clear it up.

Emer: interesting comment - maybe in some ways ethics are bounded by social values, while legality is bounded by commercial ones (as in protecting my "assets" - right to privacy).

Tim: to your point about a decision tree, maybe what we need are more cases, more examples, and more people chiming in and sharing their thoughts.

David: In fact, I shared those three cases with Phil and I was in the room at GAUGE when Alex Langshur specifically asked about the order id thing - the answer Phil gave is pretty much the same I got through email: the GA TOS may be revised eventually... Through unofficial channels, what I'm hearing is they are not running after people doing this type of thing...

This means Métro would not be compliant. For example, in this URL excerpt obtained by clicking through from their promotional email:

utm_source=Infolettre%2B28%2Bmars%2B2011&utm_medium=Courriel&utm_term=650533&utm_content=Metro%2BGP

The utm_term parameter is used to carry a code identifying the recipient. At first, they put the email address itself in there, until I warned them through the contact form (a message they never acknowledged); they then replaced the email address with this "encryption".

If I follow you correctly, Stéphane, even this would violate GA's policies. Does that mean we can never use anything that would make it possible to link the data to a user?? That would make GA useless for all practical purposes...

(It's 5:30 AM and I just realized I wrote this in French!)

Jacques: a) in the case you are presenting, sending an email address on the query string is always unethical and sometimes illegal (depending on jurisdiction).
b) for GA, using utm_term to send a key is ok according to the TOS - what's not is later extracting that key to merge it against PII (like the subscribers database). Note that this is ethical & legal... It just doesn't conform to GA's current TOS...

I know... that's stupid... and likely to change...

Yes, I know the key is not unethical, but it is obviously meant to be merged with PII. If GA does not change their policies, it will, in my opinion, render the product useless to a lot of online Marketing projects.

My bet is that there's already a good portion of companies that do not know this and are in violation.

Jacques: I specifically asked Avinash & Phil Mui (Product Manager of GA) about this element of the TOS and the official answer for now is: "the GA TOS MAY be changed in the future"... with v5 rolling out and some incredible features (sorry... can't tell, I'm under NDA), if I read between the lines, this TOS concern will be resolved soon...

Interesting... We all wish we were under that NDA!

Then, if Google would abide by their own book, until that TOS is officially changed, they should send a clear message of "cease and desist" to every account merging the data they collect in any way, shape or form, with customers' PII.

My bet is that would make many marketers go "huh?" and reconsider.