Three years ago, we learned that a BakerTweet could automagically alert customers that fresh bread or cookies were about to come out of the oven. Today, there are more than 2 billion tweets a week, many of them from bots or machines, not people. And there are 2.7 billion likes on Facebook every day, many of them automagically turned into ads that generate clicks … and more data.
Yet this data deluge is merely the tip of the iceberg, according to a new study from the Pew Research Center. We take for granted that there are chips tracking manufacturing processes and product distribution.
But there are also chips churning out data in our cars, our keys and our casino chips. Farmers track the number of insects in their fields using an online application, learning when, where and how much insecticide to apply. Scientists have sensors in the soil, oceans and air. And then there is the data from astronomy and space science.
IBM says that “90% of the data in the world today has been created in the last two years alone.”
Welcome to the era of Big Data.
Who will control all this data, the Pew researchers ask. How and where will we store it? Will the potential good (efficiency, transparency) outweigh the bad, such as loss of privacy?
Parsing Big Data
But what is Big Data, exactly? The report doesn’t define the term. Edd Dumbill at O’Reilly Radar calls it “data that exceeds the processing capacity of conventional database systems.” Conventional systems, of course, are contextual, tied to technology at a point in time. That might explain why Science magazine was writing about Big Data in 1998. Or why John R. Mashey, chief scientist at Silicon Graphics, gave a talk on Big Data and InfraStress in 1999.
The Pew Internet/Elon University survey of 1,021 Internet experts, observers and stakeholders explores two scenarios for 2020: one where the good trumps the bad and one where Big Data causes more problems than it solves.
Slightly more than half (53%) of those participating in the non-random, opt-in, online survey agreed with the complex scenario that ended thusly: “Overall, the rise of Big Data is a huge positive for society in nearly all respects.” Conversely, 39% agreed with an alternative scenario that concluded that Big Data “is a big negative for society in nearly all respects.” The other 8% didn’t make a choice.
Many respondents who came down on the “good” side of the divide acknowledged that their answers might be more “hope” than “prediction.”
In other words, the experts and those interested in the topic are split pretty much down the middle.
“The Internet magnifies the good, bad, and ugly of everyday life,” said danah boyd, senior researcher for Microsoft Research. “Of course these things will be used for good. And of course they’ll be used for bad and ugly… But that dichotomy gets us nowhere. What will be interesting is how social dynamics, economic exchange, and information access are inflected in new ways that open up possibilities that we cannot yet imagine.”
As with earlier communication and information technologies, from the telegraph to the telephone and from the photocopy machine to the computer, the path forward is neither well marked nor paved with fresh asphalt.
Alex Halavais, vice president of the Association of Internet Researchers and author of Search Engine Society, wrote, “The real power of ‘Big Data’ will come depending largely on the degree to which it is held in private hands or openly available. Openly available data, and widespread tools for manipulating it, will create new ways of understanding and governing ourselves as individuals and as societies.”
In the report, researchers Janna Quitney Anderson, Elon University, and Lee Rainie, Pew Research Center’s Internet & American Life Project, present benign customer-centric examples of Big Data today, such as Amazon and Netflix recommendation systems, Google’s search query auto-complete and banking system phone calls inquiring about “unusual activity” on a credit card.
Although there are a few mentions of finance and more than a few mentions of algorithms, there is no mention of the role that information technology, algorithms, big data and flawed assumptions played in the 2008 financial crisis that began in the housing market.
In many ways, there is little new here. Technology itself does not cause social change. That comes from the decisions we make or abdicate.
From Kevin Novak, co-chair for the eGov Working Group of the World Wide Web Consortium: “The tools, methods, and technologies will be the challenge in 2020, not the availability of the data itself. Society will continue to struggle with privacy… How we respond and manage should continue to be a major focus in the Internet community through 2020. We must understand the challenges and opportunities, know the gaps that exist, and offer the best chances for addressing these.”
Hype or hope?
As you read the report, it’s worth asking yourself: where is Big Data in Gartner’s hype cycle?
This era of Big Data, is it real or is it a diversion? SAS CEO Jim Goodnight suggested the latter when he told InformationWeek, “We’re talking about big data now because everyone got tired of talking about the cloud.”
This latest report (pdf) from Pew Research Center’s Internet & American Life Project and The Imagining the Internet Center won’t answer that question.
What it will do, if you take the time to read it (it’s 41 pages), is provide you with points to ponder. It will remind you that privacy issues related to data should not be left to corporations, nor should privacy rights protection be a lonesome ledge for advocates. It might introduce you to new thought leaders (although it also contains all of the usual suspects).
And it should forever remind you that where there’s hype, there may be hope. But it’s not guaranteed.