Postal Code Data Now in Google Analytics

Google Analytics is now recording Post Code data for visitors to websites across Europe.

Rachel McCombie of Air Experiences was the first to spot the change.

Here’s an example for the UK, where over the last few days roughly 5% of ‘city’ traffic has been recorded with a postal code rather than a city name (the percentage varied by account):

(Note that this only contains the first portion of postcodes, which gives a smallish region, but not enough data to personally identify someone.)

The data is being recorded across other European countries too. For example, here’s a snapshot of some postal code data being recorded in Germany – home of the toughest data protection laws in Europe. (Across a few accounts, German postal code data was being recorded for roughly 1.5% of all country sessions):

Similar data appears to now be flowing into accounts across many European countries, as pointed out by Benoît Perrotin:

The data appears to have begun trickling in on August 27th, with a much greater flow on the 28th:

Positives & Negatives

There are some big positives of this, but also a few negatives:

Positives:

  • This is great for direct mailers, whose businesses are very much focused around postal regions.
    • If it’s accurate, it allows you to judge response from particular regions.
    • Allows you to attribute sales to catalogues that you previously could not.
  • It’s probably good for charities, political parties, and other campaigners.
    • Many of these businesses have a ‘local’ focus, for example political parties tailoring messaging by postal code, and using local volunteers.
  • It’s good for any business with retail outlets. Rather than the arbitrary ‘city’ names, that often included small towns, postal codes are
  • It also means you can match up your data more easily with other sources:
    • Returns data for retailers.
    • Third party demographic data.
    • Population data, to understand your traffic in areas vs the actual size of the population.

The Caveats:

The first caveat is, we do not know how this data is being collected, or why it seems only to cover a percentage of traffic.

The second caveat is, it’s unlikely that the data here would be completely accurate & comprehensive. That being the case, you’d only ever likely get a sample for any given area, and it would be difficult to tell whether those samples were evenly sized by region (eg. if I’m told I have 100 visits from postcode A and 200 from postcode B, does that mean I actually got double the number of visits from the latter, or just that fewer visits from the first area were correctly classified?)

The biggest caveat on all of this is that – at present – the data is being recorded in the ‘City’ field within Google Analytics. That makes things a bit of a mess: some of the data is still recorded as city/town names, some is now recorded as postal codes. That means the data can’t be used with much confidence (eg. if you see a postal code, you have to ask “is that all of the data for that region, or is some of it grouped under a town name somewhere?”).

Update: A final caveat is: Martin Macdonald spotted an oddity with an ‘ME14’ postcode appearing to be very popular. On further digging, the same postcodes seem to appear again & again among the top 5 for different accounts. Speculation (from @scottjlawson & @davecatley) is these may be large internet exchanges/providers.

How to Find the Data

The simplest way to reach the data is to navigate to:

  • Audience > Geo > Location.
  • Click the ‘City’ Primary Dimension (just to the top left of the main table listings)
  • Use the search filter box, placing the following filter into it: “[0-9]” (including the square brackets, excluding quotes). This essentially says ‘show me any results that contain any number’, which matches most postal codes across Europe (and of course excludes city names, which do not contain numbers)
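As a rough illustration of what that filter does, here is a minimal Python sketch. The sample values are made up for the example (only ‘ME14’ appears in this post), and Python’s `re.search` stands in for the Analytics search box, on the assumption that the box matches the pattern anywhere in the value.

```python
import re

# Same pattern as the filter above: matches any value containing a digit.
CONTAINS_DIGIT = re.compile(r"[0-9]")


def split_city_values(values):
    """Split 'City' values into (postal_codes, city_names)."""
    postal_codes = [v for v in values if CONTAINS_DIGIT.search(v)]
    city_names = [v for v in values if not CONTAINS_DIGIT.search(v)]
    return postal_codes, city_names


# Hypothetical sample of what the 'City' dimension might contain.
sample = ["London", "ME14", "Manchester", "SW1", "10115", "Berlin"]
postal_codes, city_names = split_city_values(sample)
print(postal_codes)
print(city_names)
```

Anything with a digit lands in the first list, everything else in the second, which is the same split the filter gives you in the report.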

Alternatively, the following Google Analytics Custom Report will give you a quick snapshot of Sessions listed by Postal Code and Country:

Thoughts?

My hope is that this data will remain in Analytics. Ideally it would be in an additional dimension (‘Postal Code’) rather than being shoe-horned into the ‘City’ field, and would cover the US and other regions.

If you have any thoughts about any of this, do share them with me (@danbarker) on Twitter, or leave a comment below.

Do English People want Scotland to Stay Part of the UK?

Here are the results of 2 surveys. The first is one that’s been run quite a lot by polling organisations over the last couple of years. The second one is much rarer. Here are the descriptions:

  • Survey 1: A survey of 1,000 people in Scotland, carried out over the web, asking the simple question “Should Scotland be an independent country?”
  • Survey 2: A survey of 500 people in England, again carried out over the web, asking the same question: “Should Scotland be an independent country?”

I thought it was strange the second survey had not been run more often, so thought I would run a poll myself.

Scottish Results

Here are the overall results of 1,000 people in Scotland being asked the question “Should Scotland be an independent country?”

As you can see – very, very close. So close in fact that, if you look at those ‘+/-4.6%’ error bars, it’s literally too close to call based on the 1,000 people surveyed.

(I’ve included error bars, so you can see the margin of error (and whether they are meaningful), and broken them down by age & gender too.)
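For anyone who wants to sanity-check error bars like these, here is a minimal Python sketch of the textbook margin-of-error approximation for a simple random sample. It is an assumption that this applies here at all: for 1,000 respondents it gives roughly ±3.1%, which is narrower than the ‘+/-4.6%’ quoted above, so treat it as the textbook approximation rather than the calculation behind the charts. The function names and the example shares are hypothetical.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion (simple random sample)."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


def too_close_to_call(yes_share, no_share, margin):
    """Rough rule of thumb: True when the two error bars overlap."""
    return abs(yes_share - no_share) <= 2 * margin


# Textbook margins for the two sample sizes used in this post.
print(round(margin_of_error(1000), 3))
print(round(margin_of_error(500), 3))

# Hypothetical shares, purely to show the overlap check.
print(too_close_to_call(0.51, 0.49, 0.046))
print(too_close_to_call(0.80, 0.20, 0.044))
```

The overlap check is deliberately crude: if the two bars overlap, the result is treated as too close to call.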

Scottish Results by Age

Here are the results split by age, for the 2/3 of respondents where the age was known.

As you can see, the largest ‘Yes’ lean is among those 35-44; the largest ‘No’ lean is 18-24. This is interesting, but, as these are small groups of respondents I would not draw any conclusions based on this. (see the error bars, for example)

Scottish Results by Gender

Here are the Scottish results, split by gender:

Again, that’s leaning toward ‘Yes’ for male, and ‘No’ for female, but too close to call.

English Results

I ran exactly the same survey across 500 people in England – asking the question “Should Scotland be an independent country?” (worded exactly, I believe, as the official ballot question).

This time, the results were very, very different:

English Results by Age

Splitting this out by age, the results are very similar across all brackets (albeit note the error bars again here – these are very small samples in each group, so far from exact).

English Results by Gender

Again, splitting by gender we see a similar picture: English people do not want Scotland to be an independent country.

Caveats

It’s always worth caveating this kind of survey (in fact that’s true of almost all data). This is not an election. I did not ask “If you were to vote today, how would you vote when asked the question ‘Should Scotland be an independent country?’” – I simply asked the actual ballot question Scottish voters will be asked.

You’ll notice that I surveyed 500 people in England here, and 1000 in Scotland (note ‘in England’, ‘in Scotland’ rather than English/Scottish). The reason for that was I started by surveying 500 people in each. The “England” results were so conclusive I stopped. There was no clear winner in Scotland, so I ran for another 500 responses. Again, too close to call.

The final obvious caveat is: I haven’t surveyed Northern Ireland or Wales here. If you’d like me to do that, feel free to add a comment on the post. And if you’d like me to survey more people in England, because you feel doing so would alter the outcome, feel free to drop me a note too.

Overall Summary

The polls here are snapshots of 2 particular audiences. Based on those audiences, we can say:

  • The audience within Scotland are not sure whether they want Scotland to be part of the United Kingdom or not. The results are quite literally too close to call. (Voting one way or the other is another matter, where actual risk & proactive effort are both involved)
  • The English audience on the other hand are very, very, very much swayed one way: they want Scotland to remain part of the UK.

Do share this with others if you think they may be interested.

LinkedIn’s Sneaky UX Trick

Occasionally at the moment when you click a LinkedIn notification on your phone, the LinkedIn app opens and – before you’re taken to the notification – you’re presented with this screen:

That looks fairly mundane at first glance. Often apps present you with information before taking you to wherever you were going. But… if you click ‘Continue’ there, you’re actually saying “I agree to LinkedIn importing my phone address book, and storing all of those personally identifiable details within their database”.

I think that’s a bit sneaky for 3 reasons:

  1. At first glance it’s not obvious that clicking ‘continue’ will do something as big as import your address book (!)
  2. This appears when you click on a notification from your phone (eg. someone new connecting to you). In that context, ‘Continue’ feels like it means ‘Continue on to where we were taking you’; not ‘Continue importing the details of all of my friends/family/colleagues’.
  3. There is no ‘skip this’ option at all. Your 2 choices are ‘Continue’ (presented in high-contrast) or ‘Learn More’ (presented in grey-on-grey text).

I think: it feels like this has been done with the intent of getting people to click ‘Continue’ without realising what they’re doing, or because it appears to be their only real choice.

What do you think?

The Real Original Source of the Phrase “Big Data”

In early 2013, Steve Lohr of the New York Times published an article where he tracked down the origin of the phrase “Big Data”. He found several different sources, and declared that it originated in the mid-1990s. But… he specifically opted to conclude that the very earliest source he could find – from 1989 – was not the originator. His reasoning was based on 2 factors:

  1. He wanted to credit someone who used the phrase in a technical way: “The credit, it seemed to me, should go to someone who was aware of the computing context.”
  2. He did not feel that the original usage of the phrase fitted the same idea of ‘Big Data’ as his. He therefore concluded the first usage was: “not, I don’t think, a use of the term that suggests an inkling of the technology we call Big Data today.”

I read Steve’s article at the time, where he declared that the first ever use of “Big Data” was not the originator, and thought “that’s a little unfair”. I keep going back to it, because the first source he found, and apparently the original usage of the phrase “Big Data” was very insightful, and covers perhaps the two biggest issues in relation to data today: its massive worth from a corporate point of view, and its massive privacy implications from a consumer point of view.

The original article was published on July 26th, 1989, under the headline “How Did They Get Your Name? Direct-mail Firms Have Vast Intelligence Network Tracking Consumers”. It was written by Erik Larson (now a best-selling author). The article talks about organisations gathering, joining, and mining data on millions of people, to use for marketing purposes. Here are a couple of example paragraphs:

“We’ve been scavenged by data pickers who sifted through our driving record and auto registrations, our deed and our mortgage, in search of what direct mailers see as the keys to our identities; our sexes, ages, the ages of our cars, the equity we hold in our home.

The scavengers record this data in central computers, which, in turn, merge it with other streams of revelatory data collected from other sources – the types of magazines we subscribe to, the organizations we support, how much credit we’ve got left – and then spit it all out (for a price) to virtually anyone who wants it.”

It goes on to talk about future implications of all of this:

“It is an interesting exercise to imagine the big marketing databases put to use in other times, other places, by less trustworthy souls. What, for instance, might health insurers do with the subscription lists of gay publications?”

Despite the dated & simplistic example, this is of course what many people today worry about: what governments try to regulate, where companies spend millions setting up & utilising systems, what we use in real time to deliver relevant ads to people as they browse websites, and – with a little stretching – what much of the NSA/Edward Snowden stuff was about. It is an article from 1989 talking about one of the biggest issues in technology today. And there, in the middle, is the first ever usage of the phrase “Big Data”:

There’s a copy of the original article over on the Orlando Sentinel website, ironically now full of real-time targeted ads. Erik Larson later released a book expanding on the topic: “The Naked Consumer: How Our Private Lives Become Public Commodities”. Despite being 25 years old, both the article and the book essentially talk about one of the versions of the phrase “Big Data” we use today: a cornerstone of modern marketing from a corporate point of view, and a privacy worry from a consumer point of view for many.

BuzzFeed is Watching You

When you visit BuzzFeed, they record lots of information about you.

Most websites record some information. BuzzFeed record a whole ton. I’ll start with the fairly mundane stuff, and then move on to one example of some slightly more scary stuff.

First: The Mundane Bits

Here’s a snapshot of what BuzzFeed records when you land on a page. They actually record much more than this, but this is just the info they pass to Google (stored within Google Analytics):

Here’s a description of what’s going on there:

The first line there is how many times in total I’ve visited the site (above this, which I’ve skipped for brevity, it also records the time I first visited, and a timestamp of my current visit).

Below that, the ‘Custom Var’ block is made up of elements BuzzFeed have actively decided “we need to record this in addition to what Google Analytics gives us out of the box”. Against these, you can see ‘scope’. A scope of ‘1’ means it’s something recorded about the user, ‘2’ means it’s recorded about the current visit, ‘page’ means it’s just a piece of information about the page itself.

There you can see other info they’re tracking, including:

  • Have you connected Facebook with BuzzFeed?
  • Do you have email updates enabled?
  • Do they know your gender & age?
  • How many times have you shared their content directly to Facebook & Twitter & via Email?
  • Are you logged in?
  • Which country are you in?
  • Are you a BuzzFeed editor?
  • …and about 25 other pieces of information.

Within this you can also see it records ‘username’. I think that’s recording my user status, and an encoded version of my username. If I log in using 2 different browsers right now, it assigns me that same username string, but I’m going to caveat that I’m not 100% sure they’re recording that it is ‘me’ browsing the site (ie. that they’re able to link the data they’re recording in Google Analytics about my activity on the site back to my email address and other personally identifiable information). Either way, everything we’ve covered so far is quite mundane.

The Scary Bit

The scary bit occurs when you think about certain types of BuzzFeed content; most specifically: quizzes. Most quizzes are extremely benign – the stereotypical “Which [currently popular fictional TV show] Character Are You?” for example. But some of their quizzes are very specific, and very personal.

Here, for example, is a set of questions from a “How Privileged are You?” quiz, which has had 2,057,419 views at the time I write this. I’ve picked some of the questions that may cause you to think “actually, I wouldn’t necessarily want anyone recording my answers here”.

When you click any of those quiz answers, BuzzFeed record all of the mundane information we looked at earlier, plus they also record this:

Here’s what they’re recording there:

  • ‘event’ simply means something happened that BuzzFeed chose to record in Google Analytics.
  • ‘Buzz:content’ is how they’ve categorised the type of event.
  • ‘clickab:quiz-answer’ means that the event was a quiz answer.
  • ‘ad_unit_design3:desktopcontrol’ seems to be their definition of the design of the quiz answer that was clicked.
  • ‘ol:1218987’ is the quiz ID. In other words, if they wish, they could say “show me all the data for quiz 1218987” knowing that’s the ‘Check Your Privilege’ quiz.
  • ‘1219024’ is the actual answer I checked. Each quiz answer on BuzzFeed has a unique ID like this. Ie. if you click “I have never had an eating disorder” they record that click.

In other words, if I had access to the BuzzFeed Google Analytics data, I could query data for people who got to the end of the quiz & indicated – by not checking that particular answer – that they have had an eating disorder. Or that they have tried to change their gender. Or I could run a query along the following lines if I wished:

  • Show me all the data for anyone who answered the “Check Your Privilege” quiz but did not check “I have never taken medication for my mental health”.
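To make that kind of query concrete, here is a minimal Python sketch. It does not use any Google Analytics API: it assumes the events have already been exported as simple rows, and the row layout, the visitor IDs and the second answer ID are all invented for illustration. Only the quiz ID (1218987) and the answer ID (1219024) come from the event shown above, and I’m not claiming to know which statement that answer ID maps to.

```python
from collections import defaultdict

# Hypothetical exported rows: (visitor_id, quiz_id, answer_id).
events = [
    ("visitor-a", "1218987", "1219024"),
    ("visitor-a", "1218987", "1219025"),  # 1219025 is an invented answer ID
    ("visitor-b", "1218987", "1219025"),
    ("visitor-c", "9999999", "1219024"),  # a different, invented quiz
]


def visitors_who_skipped(rows, quiz_id, answer_id):
    """Visitors who clicked at least one answer in the quiz, but never the given one."""
    answers_by_visitor = defaultdict(set)
    for visitor, quiz, answer in rows:
        if quiz == quiz_id:
            answers_by_visitor[visitor].add(answer)
    return sorted(
        visitor
        for visitor, answers in answers_by_visitor.items()
        if answer_id not in answers
    )


print(visitors_who_skipped(events, "1218987", "1219024"))
```

The point is how little is needed: one record per click, with a quiz ID and an answer ID, is enough to list everyone who left a particular box unchecked.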

In BuzzFeed’s defense, I’m sure when they set up the tracking in the first place they didn’t foresee that they’d be recording data from quizzes of this personal depth. This is just a single example, but I suspect this particular quiz would have had less than 2 million views if everyone completing it realised every click was being recorded & could potentially be reported on later – whether that data is fully identifiable back to individual users, or pseudonymous, or even totally anonymous.

What do you think?

.UK Domains Launched – Sorry!

On June 10th 2014, at 8am, Nominet (the UK domain registry) launched “.uk” domains. In other words, I could now move this site to “http://barker.uk” rather than “http://barker.co.uk”.

To announce the launch – the biggest change to UK internet addresses in many, many years – Nominet have launched what they call “the world’s largest welcome sign”, visible from 35,000 feet. Here’s how the Daily Mail described this enormous sign:

Sadly – here’s what you see if you visit the URL on the world’s largest welcome sign:

A shame to have launched the world’s largest welcome sign leading to a large “Sorry…” notice, and a nice lesson to remember to double check your landing pages when running multi-channel campaigns.

Note: If you’d like a full summary of the .uk change, what it means, and what to do about it, feel free to leave a comment and I’ll update this post later.

The John Lewis Email Spam Fine

Part of the email marketing industry in the UK is built around this phrase:

‘in the course of a sale or negotiations for the sale of a product or service’.

Those are the conditions under which – if you have collected an email address – you are allowed to send marketing emails (b2c), even if the recipient has not explicitly opted in to receive mail from you.

Most sites assume that signing up for an account, or beginning a checkout process, falls within ‘negotiations for the sale of a product or service’. As a result, they consider it perfectly ok to send you abandoned basket emails if you have begun checkout, and it’s fairly standard practice to email users who have registered for an account with you, as long as they have not specifically opted out.

Here’s how the Information Commissioner’s Office talk about this:

John Lewis essentially did exactly that, or considered they had. Here is how the man who took them to court (a Sky News producer) described John Lewis’ argument (from http://news.sky.com/story/1272933/spammer-to-pay-damages-after-court-victory):

To be clear: What John Lewis were doing here is considered fairly good practice. The user signed up for an account. They had the opportunity to opt out & did not. Yet the court still considered it spam & issued a fine.

What does this mean for email marketing? 

If you are a business or a website owner:

  • It may mean you should relook at the wording on your website to make it clear that an account signup is considered ‘negotiation toward a sale’.
  • It may mean you need to speak to your abandoned basket email provider to ask “are we definitely covered here? If not, what do we need to do?”
  • It may mean that your ‘opt out’ box should be more prominent after signing up & that you highlight that the sign up is considered the beginning of a relationship.
  • It may mean you should check through how your existing email addresses have been acquired a little more thoroughly.
  • It may mean some sites need to watch out for scammers, putting in spam claims to try and win the fine money.
  • It may even mean you need to move to double opt-in, or more heavily confirm opt-ins. Anyone can enter anyone else’s email address on a form, so a submitted address is not necessarily confirmation from the actual email owner that they wish to receive your communications.
  • It may mean you should think about not emailing users unless they have explicitly ticked a box, even though the Information Commissioner says it’s fine to do just that under some conditions.

Or, this may just be a fluke, and another court may decide a similar case entirely differently.

UKIP – Powered By Foreign Technology

The United Kingdom Independence Party (UKIP) have launched a new advertising campaign. It hinges on 2 key messages:

  1. Foreign labour is damaging the UK.
  2. Much of UK law is controlled from overseas.

Here are two of their posters covering these issues:

Based on this, you may think they’d be keen on UK technology. Yet here’s the technology behind UKIP’s website:

Even their domain name is not from the UK: the “.org” in UKIP.org is governed from the USA.

The Mirror’s Crying Child Photo – Not All That it Seems

Here’s the front cover of the Daily Mirror. A haunting image of a starving British child, crying their eyes out.

Only… the child is from the Bay Area, and the photo was purchased from Flickr via Getty Images…

Here’s the source of the original image: https://www.flickr.com/photos/laurenrosenbaum/4084544644/ (Here’s a happier one taken the following day: https://www.flickr.com/photos/laurenrosenbaum/4086511962/. Apparently she was crying over an earthworm.)

An excellent photo, taken by the excellent Lauren Rosenbaum in November 2009, shared on a US website (Flickr), sold by an American photo agency (Getty Images), used to illustrate poverty in Britain.

  • Does it matter that the photo is not really a starving child?
  • Does it matter that the photo wasn’t even taken in the UK?
  • Is there an ethical issue in buying a stock photo of a child – not in poverty – and using it to illustrate poverty?
  • Does it matter that the headline begins “Britain, 2014”, but the photo is actually “USA, 2009”?

I’m not sure of the answers to any of the above, but they’re interesting to think about.

What do you think?

How the US Airways Tweet Happened

If you’re reading this, you will know that US Airways sent an incredibly lewd photo to one of their passengers in response to a complaint.

Here is the massively censored version of the Tweet:

The 2 Key Events:

  1. Very shortly before the US Airways tweet, the @ARTxDEALER Twitter account posted ‘the photo’, addressing the Tweet to @AmericanAir. (side-note: American Air & US Airways recently merged)
  2. US Airways posted a response to user @ElleRafter: “We welcome feedback, Elle. If your travel is complete, you can detail it here for review and follow up: pic.twitter.com/vbeYgXXXXX” (I’ve deliberately changed that URL to protect the innocent).

The Actual Explanation:

  • US Airways recently merged with American Air.
  • Whoever is in control of the US Airways twitter account also monitors American Air’s brand on Twitter.
  • Having seen the lewd photo sent to American Air, the social media exec copied the URL (perhaps emailing it to someone to report it, for example)
  • When they responded to @ElleRafter, instead of pasting the URL of their complaints form, they accidentally pasted the Twitter image URL. Doing that reattached the image to their tweet.

The key piece of information is that if you copy & paste a ‘pic.twitter.com…’ Twitter photo URL into your tweet, it reattaches that photo to your tweet.

Summary: Mystery solved. The twitter account ‘@ARTxDEALER’ accidentally caused the whole thing. (I wouldn’t recommend visiting their account – not safe for work!)

Very good luck to the poor person in charge of the US Airways/American Air twitter accounts. A tough job and – from the looks of things – an honest mistake.