I have been having a spirited marathon debate with a couple of my friends. Is this alleged new “Big Data thingy” so transformational that it will change our everyday lives, or is it just an evolutionary advance that may improve productivity but not much else? The same arguments may apply to the concepts of “The Cloud” and “Smart Mobile.” The three, taken together, are coalescing into the major information technology forces that will drive innovation and productivity into the foreseeable future.
We are hearing regularly in the media about so-called “Big Data.” What exactly is Big Data? A number of differing definitions have been offered by a wide range of media sources. ZDNet’s definition is one of the best I have seen so far. In essence, big data is about liberating data that is large in volume, broad in variety and high in velocity from multiple sources in order to create efficiencies, develop new products and be more competitive. Forrester puts it succinctly in saying that big data encompasses “techniques and technologies that make capturing value from data at an extreme scale economical.” Prior to the emergence of commercial Big Data, the concept existed only where cost was no object: in the black world of the National Security Agency, where it required the largest purpose-built supercomputers in existence.
A zettabyte (symbol ZB, derived from the SI prefix zetta-) is a quantity of information or information storage capacity equal to 10²¹ bytes, or 1,000 exabytes (one sextillion, or one long-scale trilliard, bytes). That is 1 billion terabytes. Today, you can walk into your local computer store and buy a couple of terabytes for $100. At that rate, a zettabyte works out to roughly $50 billion. In real terms that is dirt cheap, and getting cheaper daily. Now that we have that cleared up, we can move to the next level.
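For anyone who wants to check the arithmetic, here is a minimal back-of-envelope sketch. It assumes that “a couple of terabytes for $100” means 2 TB for $100, or $50 per terabyte; the function names are mine and purely illustrative. Under that assumption, a zettabyte of consumer disk comes to about $50 billion.

```python
# Back-of-envelope cost of one zettabyte at consumer disk prices.
BYTES_PER_TERABYTE = 10**12   # SI terabyte
BYTES_PER_ZETTABYTE = 10**21  # SI zettabyte

# Assumption (not a quoted price): 2 TB for $100, i.e. $50 per terabyte.
ASSUMED_DOLLARS_PER_TERABYTE = 100 / 2


def terabytes_per_zettabyte() -> int:
    """Number of terabytes in one zettabyte."""
    return BYTES_PER_ZETTABYTE // BYTES_PER_TERABYTE


def zettabyte_cost(dollars_per_terabyte: float) -> float:
    """Dollar cost of one zettabyte at a flat per-terabyte price."""
    return terabytes_per_zettabyte() * dollars_per_terabyte


if __name__ == "__main__":
    print(f"{terabytes_per_zettabyte():,} TB per ZB")
    print(f"${zettabyte_cost(ASSUMED_DOLLARS_PER_TERABYTE):,.0f} per ZB")
```

Change the assumed price per terabyte and the total scales with it, which is the whole point: every drop in storage prices takes billions off the bill.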
With regard to the obvious issue of personal privacy, the European Union and other organizations have made efforts to protect privacy, with very mixed results. Other governments, notably China, are aggressively implementing opposite policies to strictly limit privacy. Highly sophisticated telecommunications equipment has been available for years that enables deep analysis of all of your voice and Internet traffic. We learned this when Dick Cheney secretly set up such equipment to track and record all voice and data traffic in the United States. The equipment trapped and analyzed all of it in real time. You didn’t notice a thing. The thing about your personal data is that they already have it. Most of it comes from public sources you authorized. I am not advocating this; I am only the messenger. The co-founder and former CEO of Sun Microsystems, Scott McNealy, famously said, “You have zero privacy anyway. Get over it.” We must not ignore the serious issue of privacy, but the problem is already here and deep data mining is thriving. Privacy needs a revolution of its own.
The core question then becomes whether Big Data, and for that matter the Cloud and Smart Mobile, represent revolutionary and transformational changes in technological capability and, consequently, in human culture and politics: in how we conduct ourselves in the world. Or is it just so many more boring zeros and ones zooming by at the speed of light, stored in chips, and processed by microprocessors? No big deal, just IT management as normal. Frankly, this is a significant philosophical question. For this discussion, we will focus only on Big Data. Discussion of the Cloud and Smart Mobile will follow later. My most recent post on Smart Mobile gives a hint of my thoughts: Mobile Market Share: A War of Titans Worth Following, http://mayo615.com/2013/01/21/mobile-os-market-share-strategy-war-of-the-titans-worth-following/
In fairness, I cut my teeth on Marshall McLuhan’s ideas while in university in the 1960s. In an amazing irony, I soon fell into Intel Corporation at the birth of the microprocessor revolution, and later I was also present to participate personally in the emergence of the personal computer. My memory of McLuhan kept popping up everywhere. As my career progressed, I seemed to jump onto each new wave: networking at Sun Microsystems, then the Internet infrastructure build-out explosion with Ascend Communications, and finally a host of new companies based on Internet capabilities. Through all of it, I could only conclude that somehow McLuhan, like some kind of modern Nostradamus, had foreseen it all. Most importantly, my own life was transformed by it all, and I saw with my own eyes the massive transformation occurring all around me.
So I have no doubt that Big Data is transforming our lives, and will continue to transform our lives, in ways we cannot yet fully grasp, as I could not grasp McLuhan when I first heard him, or the significance of the Internet as I sat right in the middle of it.
I have previously described Big Data as analogous to the evolution of Chaos Theory. For centuries, full understanding of the complexity of nature’s designs was thought to be the realm of God, and beyond human comprehension and explanation. Then in the 1960s, in places like Santa Cruz, California and Germany, the elegant simplicity of a solution to chaos began to emerge. The massive scale of Big Data is a very similar nut to crack. We are now seeing an elite group of data scientists and mathematicians begin to solve Big Data in a way similar to how chaos was resolved. Google, Microsoft Bing, Baidu, Yahoo and Amazon are driving the development of these mathematical skill sets.
Last year I showed my UBC Faculty of Management students a YouTube video on data mining. In the video, the two Hungarian mathematicians leading a data mining company described how they had solved hideously complex problems that were previously beyond any computational solution. The key to their success was their ability to extract very precise, useful information from extraordinarily large stores of data. The metaphor here is finding a particular grain of sand on a very large beach. A parallel key factor has been the incessant march of Moore’s Law. Even 10 years ago, successful data mining on this scale could not have been accomplished. The computational cycles and high-speed mass storage were not available or were too expensive. Today those microprocessor cycles are available. The costs will continue to plummet, making further advances inevitable. Failure to consider Moore’s Law and available computational cycles has also been the cause of many failed ideas over the years. But the threshold has arrived.
Today, developments like Google Spanner, the largest known database architecture in the World, have joined with the computational solutions.
Unveiled this fall after years of hints and rumors, it’s the first worldwide database worthy of the name — a database designed to seamlessly operate across hundreds of data centers and millions of machines and trillions of rows of information.
Spanner is a creation so large, some have trouble wrapping their heads around it. But the end result is easily explained: With Spanner, Google can offer a web service to a worldwide audience, but still ensure that something happening on the service in one part of the world doesn’t contradict what’s happening in another.
Google’s decision to reveal Spanner has many dimensions. First, it provides a peek into the black world of the U.S. National Security Agency and the U.S. Defense Intelligence Agency. Previously, the existence of such large and sophisticated global databases was only imagined. We now know they exist and are a crucial component of Big Data.
Read more in my post, Google Spanner, the single largest database in the world
For me, the most compelling example of how this all works has been the extremely sophisticated Big Data mining used by the Obama campaign to achieve re-election. As early as March 2012, the Wall Street Journal began reporting about “Dashboard,” the Obama campaign app that was mining Big Data to find undecided voters in key states. But not only undecided voters. Dashboard can key in on, find and persuade “Off the Grid” voters. Off the Grid is the term used to describe those people, such as students and other young people, with constantly changing locations and only a mobile phone. These voters have historically been virtually impossible to reach. This short PBS NewsHour video below speaks volumes about the extraordinary impact and value of Big Data, not seen before.
The campaign’s hiring of Rayid Ghani as “chief data scientist,” along with an army of data analysts, set the stage for what was to come. On election night, Mitt Romney and Paul Ryan were absolutely convinced that they had won the election, but were shocked to find otherwise. Working through their disbelief, both candidates later remarked on the enormous voter turnout for Democrats in key locations and the “technology advantage” of the Obama campaign.
So from my years of observation of the march of technology and its impact on my own life, I am convinced that we are entering another transformational period as profound as the emergence of the Internet itself.
I have been repeatedly drawn back to Steve Jobs’s 2005 Stanford University commencement address, in which he closes with references to Stewart Brand and The Whole Earth Catalog. Stewart Brand is an extraordinary futurist. One of Ken Kesey’s original Merry Pranksters chronicled in Tom Wolfe’s non-fiction novel, The Electric Kool-Aid Acid Test, Brand had been inspired by the legendary photograph of the Earth taken by Apollo 8 astronaut William Anders. Brand is also the founder of The Well, the very early Sausalito-based Internet Service Provider, and is now considered one of the most important thinkers on human culture, technology and its impacts. Word of Jobs’s commencement address spread virally around the Valley... “Did you hear what Jobs said at Stanford today?” Steve was basically saying that he too understood what McLuhan had said, and that Stewart Brand, by publishing The Whole Earth Catalog, also understood the transformational importance of the Global Village.