When you visit BuzzFeed, they record lots of information about you.
Most websites record some information. BuzzFeed record a whole ton. I’ll start with the fairly mundane stuff, and then move on to one example of some slightly more scary stuff.
First: The Mundane Bits
Here’s a snapshot of what BuzzFeed records when you land on a page. They actually record much more than this, but this is just the info they pass to Google (stored within Google Analytics):
Here’s a description of what’s going on there:
The first line there is how many times in total I’ve visited the site (above this, which I’ve skipped for brevity, it also records the time I first visited, and a timestamp of my current visit).
Below that, the ‘Custom Var’ block is made up of elements BuzzFeed have actively decided “we need to record this in addition to what Google Analytics gives us out of the box”. Against each of these, you can see a ‘scope’. A scope of ‘1’ means it’s something recorded about the user; ‘2’ means it’s recorded about the current visit; and ‘page’ means it’s just a piece of information about the page itself.
There you can see other info they’re tracking, including:
- Have you connected Facebook with BuzzFeed?
- Do you have email updates enabled?
- Do they know your gender & age?
- How many times have you shared their content directly to Facebook & Twitter & via Email?
- Are you logged in?
- Which country are you in?
- Are you a BuzzFeed editor?
- …and about 25 other pieces of information.
Within this you can also see it records ‘username’. I think that’s recording my user status, and an encoded version of my username. If I log in using 2 different browsers right now, it assigns me that same username string. I’m going to caveat that, though: I’m not 100% sure they’re recording that it is ‘me’ browsing the site (i.e. that they’re able to link the data they’re recording in Google Analytics about my activity on the site back to my email address and other personally identifiable information). Either way, everything we’ve covered so far is quite mundane.
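To make the scope idea a bit more concrete, here’s a minimal Python sketch of how a handful of custom variables could be modelled and grouped by what they describe. The `CustomVar` class, the variable names and the values are all made up for illustration; they are not BuzzFeed’s real variables, and this is not Google Analytics’ own API.

```python
from collections import defaultdict
from dataclasses import dataclass

# Scope labels as described in the post: '1' = user, '2' = visit, 'page' = page.
SCOPE_MEANING = {"1": "user", "2": "visit", "page": "page"}


@dataclass(frozen=True)
class CustomVar:
    name: str   # hypothetical variable name
    value: str  # hypothetical recorded value
    scope: str  # "1", "2" or "page"


def group_by_scope(custom_vars):
    """Group custom variables by what they describe: user, visit or page."""
    grouped = defaultdict(dict)
    for var in custom_vars:
        grouped[SCOPE_MEANING[var.scope]][var.name] = var.value
    return dict(grouped)


# Invented examples, not real BuzzFeed data.
example_vars = [
    CustomVar("facebook_connected", "no", "1"),
    CustomVar("logged_in", "yes", "2"),
    CustomVar("page_type", "quiz", "page"),
]

grouped = group_by_scope(example_vars)
print(grouped)
```

The point is simply that anything with a user-level scope follows you around from visit to visit, whereas the page-level bits only describe whatever you happen to be looking at.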
The Scary Bit
The scary bit occurs when you think about certain types of BuzzFeed content; most specifically: quizzes. Most quizzes are extremely benign – the stereotypical “Which [currently popular fictional TV show] Character Are You?” for example. But some of their quizzes are very specific, and very personal.
Here, for example, is a set of questions from a “How Privileged are You?” quiz, which has had 2,057,419 views at the time I write this. I’ve picked some of the questions that may cause you to think “actually, I wouldn’t necessarily want anyone recording my answers here”.
When you click any of those quiz answers, BuzzFeed record all of the mundane information we looked at earlier, plus they also record this:
Here’s what they’re recording there:
- ‘event’ simply means something happened that BuzzFeed chose to record in Google Analytics.
- ‘Buzz:content’ is how they’ve categorised the type of event.
- ‘clickab:quiz-answer’ means that the event was a quiz answer.
- ‘ad_unit_design3:desktopcontrol’ seems to be their definition of the design of the quiz answer that was clicked.
- ‘ol:1218987’ is the quiz ID. In other words, if they wish, they could say “show me all the data for quiz 1218987”, knowing that’s the ‘Check Your Privilege’ quiz.
- ‘1219024’ is the actual answer I checked. Each quiz answer on BuzzFeed has a unique ID like this. I.e. if you click “I have never had an eating disorder” they record that click.
In other words, if I had access to the BuzzFeed Google Analytics data, I could query data for people who got to the end of the quiz & indicated – by not checking that particular answer – that they have had an eating disorder. Or that they have tried to change their gender. Or I could run a query along the following lines if I wished:
- Show me all the data for anyone who answered the “Check Your Privilege” quiz but did not check “I have never taken medication for my mental health”.
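Once every click is stored as an event, a query like that is not hard to express. Here’s a rough Python sketch, assuming each recorded click carries some kind of visitor identifier alongside the quiz ID and answer ID. The `QuizClick` record, the visitor IDs, the second answer ID and the sample events are all invented for illustration; only the quiz ID 1218987 and the answer ID 1219024 come from the example above. It also treats “clicked at least one answer” as a stand-in for “took the quiz”, and it is not how you would actually query Google Analytics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuizClick:
    visitor_id: str  # hypothetical identifier for a browser or user
    quiz_id: int
    answer_id: int


def visitors_who_skipped(events, quiz_id, answer_id):
    """Visitors who clicked some answer in the quiz but never clicked answer_id."""
    took_quiz = {e.visitor_id for e in events if e.quiz_id == quiz_id}
    checked = {
        e.visitor_id
        for e in events
        if e.quiz_id == quiz_id and e.answer_id == answer_id
    }
    return took_quiz - checked


QUIZ_ID = 1218987          # quiz ID from the example above
ANSWER_ID = 1219024        # answer ID from the example above
OTHER_ANSWER_ID = 1219025  # made-up ID for some other answer

# Invented sample events.
events = [
    QuizClick("visitor-a", QUIZ_ID, ANSWER_ID),
    QuizClick("visitor-a", QUIZ_ID, OTHER_ANSWER_ID),
    QuizClick("visitor-b", QUIZ_ID, OTHER_ANSWER_ID),
    QuizClick("visitor-c", 999, ANSWER_ID),  # a different quiz entirely
]

skipped = visitors_who_skipped(events, QUIZ_ID, ANSWER_ID)
print(sorted(skipped))  # ['visitor-b']
```

Notice that the sensitive part is inferred from a click that never happened: visitor-b is singled out precisely because they did not check the answer.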
In BuzzFeed’s defense, I’m sure when they set up the tracking in the first place they didn’t foresee that they’d be recording data from quizzes of this personal depth. This is just a single example, but I suspect this particular quiz would have had fewer than 2 million views if everyone completing it realised every click was being recorded & could potentially be reported on later – whether that data is fully identifiable back to individual users, or pseudonymous, or even totally anonymous.
What do you think?