In early 2013, Steve Lohr of the New York Times published an article in which he tracked down the origin of the phrase “Big Data”. He found several different sources, and declared that it originated in the mid-1990s. But he specifically chose to conclude that the very earliest source he could find – from 1989 – was not the originator. His reasoning was based on two factors:
- He wanted to credit someone who used the phrase in a technical way: “The credit, it seemed to me, should go to someone who was aware of the computing context.”
- He did not feel that the original usage of the phrase fitted the same idea of ‘Big Data’ as his. He therefore concluded the first usage was: “not, I don’t think, a use of the term that suggests an inkling of the technology we call Big Data today.”
I read Steve’s article at the time, in which he declared that the first ever use of “Big Data” was not the originator, and thought “that’s a little unfair”. I keep going back to it, because the first source he found, apparently the original usage of the phrase “Big Data”, was very insightful, and covers perhaps the two biggest issues in relation to data today: its massive worth from a corporate point of view, and its massive privacy implications from a consumer point of view.
The original article was published on July 26th, 1989, under the headline “How Did They Get Your Name? Direct-mail Firms Have Vast Intelligence Network Tracking Consumers”. It was written by Erik Larson (now a best-selling author). The article talks about organisations gathering, joining, and mining data on millions of people, to use for marketing purposes. Here are a couple of example paragraphs:
“We’ve been scavenged by data pickers who sifted through our driving record and auto registrations, our deed and our mortgage, in search of what direct mailers see as the keys to our identities; our sexes, ages, the ages of our cars, the equity we hold in our home.
The scavengers record this data in central computers, which, in turn, merge it with other streams of revelatory data collected from other sources – the types of magazines we subscribe to, the organizations we support, how much credit we’ve got left – and then spit it all out (for a price) to virtually anyone who wants it.”
It goes on to talk about future implications of all of this:
“It is an interesting exercise to imagine the big marketing databases put to use in other times, other places, by less trustworthy souls. What, for instance, might health insurers do with the subscription lists of gay publications?”
Despite the dated & simplistic example, this is of course what many people worry about today. It is what governments try to regulate, what companies spend millions setting up & utilising systems for, what we use in real time to deliver relevant ads to people as they browse websites, and – with a little stretching – what much of the NSA/Edward Snowden story was about. It is an article from 1989 talking about one of the biggest issues in technology today. And there, in the middle, is the first ever usage of the phrase “Big Data”:
There’s a copy of the original article over on the Orlando Sentinel website, ironically now full of real-time targeted ads. Erik Larson later released a book expanding on the topic, “The Naked Consumer: How Our Private Lives Become Public Commodities”. Despite being 25 years old, both the article and the book essentially describe one of the meanings of “Big Data” we use today: a cornerstone of modern marketing from a corporate point of view, and a privacy worry for many from a consumer point of view.