“A man should look for what is, and not for what he thinks it should be. Information is not knowledge.” – Albert Einstein
In the world of Big Data, insight, not information, is power. Visibility into what customers and constituents want and why they want it can help organizations create effective strategies that satisfy and provide value on the front end, and reduce waste and eliminate costly guesswork on the back end.
But incredibly large and growing data volumes are straining both the technical and human resources of many organizations. Some researchers estimate that 2.7 zettabytes of information are stored digitally around the globe, a volume that grew by 48 percent over 2011. Others say the amount of global data is doubling every 18 to 24 months. There's no question that the digital universe is rapidly expanding, and making sense of that universe, or at least your corner of it, is getting increasingly difficult.
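Those two estimates are at least consistent with one another. Here is a quick back-of-the-envelope check, a sketch only, using the 18- and 24-month doubling periods quoted above; everything else is illustrative arithmetic:

    # Back-of-the-envelope check: what annual growth rate does a given
    # doubling period imply? (Illustrative only; the 18- and 24-month
    # doubling figures are the estimates quoted above.)

    def annual_growth_rate(doubling_months):
        # If data doubles every d months, one year multiplies it by 2**(12/d).
        return 2 ** (12.0 / doubling_months) - 1

    for months in (18, 24):
        print(f"Doubling every {months} months is roughly "
              f"{annual_growth_rate(months):.0%} growth per year")

Doubling every 24 months works out to about 41 percent growth a year, and every 18 months to about 59 percent, which neatly brackets the 48 percent year-over-year growth estimate.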
What's needed is a renewed focus on the place where this information resides: the ground zero, if you will, of Big Data, which is the storage infrastructure. What's needed is a smarter approach to storage that tightens up the infrastructure and transforms it into a strategic business asset rather than a data dumping ground.
A smarter approach to storage is one that leverages technologies designed to improve efficiency and better manage storage sprawl; technologies that are self-optimizing and require little if any human interaction; and finally, technologies that are easily virtualized to take advantage of the cloud for greater flexibility, scalability and cost savings.
Setting aside the fact that there are more people alive today than at any other point in history, and of course that means more data ...
Whenever I read about "big data," I can't help but think that there's always been "big data." The only difference is, since this "big data" was not stored electronically, nor was the vast majority of it obsessively kept in safe storage, no one ever worried about accessing most of it.
I mean, did people always save all their personal letters before? Not me, for sure. And yet now, if your personal letters sit on your hard drive, most people probably feel compelled to move them to long-term storage, along with all their other files, for safekeeping, in case their hard drive crashes. How many people go through every single file to see whether it makes sense to keep it?
And once you have these stored electronically, you feel you should be able to find anything you're looking for, even though no one would have obsessed over this previously.
The "personal file" example is, of course, just an example. All you have to do is look at your typical enterprise shared drive to see that there is a huge amount of "who cares" material in there. You know, a purchase order from 20 years ago. That presentation you never actually gave, about stuff that is totally obsolete today anyway. Back when, when you moved your office, most of that stuff had to be tossed. Now, instead, it gets meticulously saved in some long term storage facility.
Not saying this is bad, not saying we shouldn't be looking to ways to sort through all this stuff, but what I am saying is, it's really not a new problem. It's a problem that always existed, only now we're worrying about it.
Big Data is more than a trend; it is a new way forward. It allows organizations and governments to operate more efficiently than ever before. It allows them to make better predictions through better analysis. None of this replaces the human. We will still be needed to make the risky call based on human intuition that a computer just can't make. But Big Data allows us to be wiser and better prepared as we make those risky calls.
As Brian pointed out, at the core of Big Data is the storage infrastructure. Big Data will not be a one-size-fits-all infrastructure but a cast of storage components, each tuned to perform a certain function. Unlike other initiatives that storage supports, Big Data requires "everything": capacity, performance and economical long-term retention.
Like a symphony, what is needed is a conductor to manage data flow and bring order. It should also automate as much of the work as possible so that human intervention is kept to a minimum. This requires a company with a broad portfolio of storage products and a history of automation through analytics.
In his book "Reinventing Discovery" Michael Nielsen defines data-driven intelligence as the ability of computers to extract meaning from data. Now he compares data-driven intelligence to human and artificial intelligence, but for our purposes we can contrast it to other types of IT software intelligences.
Among those are application-driven intelligence which codify business processes such as ERP and financial applications that are the bedrock of modern IT, infrastructure intelligence, such as operating systems and middleware, and communicating intelligence, such as e-mail, tweeting, texting, and FaceBook. All are vital and growing, but the one that is now attracting our intention more and more is data-driven intelligence.
Data-driven intelligence applications, of which Big Data is a focal point,are created and managed to fit the needs of the data which may be (and likely are) independent of the application that created the data. No, this is not new, but the growth rate and the value of the analyses that are associated with the data surely are.
And where there is data there is storage. Managing that storage for performance, affordability, and as Brian points out most importantly insight is going to become more critical to all enterprises, both public and private. That presents a challenge to storage in a number of ways as storage is inextricably intertwined with data-driven intelligence . Those that make the connection and do it right will reap the benefits. Those locked into a price per GB mentality will not. It should be fun watching what happens and who the winners and losers are.