I’m not surprising anyone by mentioning that we are in an era of unprecedented data creation which, in turn, creates an equally unprecedented need for storage. Long gone are the days of megabyte-based capacities; once-large gigabyte ranges have given way to multi-terabyte drives, and even those are being pooled into petabyte-scale arrays in many data centers. Exabyte-scale storage won’t be far off as mobile data and video usage continue to climb.
Add to that the various types of storage, including local, network, cloud, and increasingly popular flash arrays for super-fast access to data. But, that dynamic range of storage options can create challenges for businesses, which need to understand how to allocate storage resources in a way that is both operationally efficient and fiscally responsible. And, when you factor in dynamic usage patterns and changing SLAs that have to be met, the challenge becomes not only how to allocate capacity initially, but how to move data between storage facilities effectively as needs change.
“What typically happens is people overprovision so they can allocate storage and compute resources to meet each of those requirements,” explains Lance Smith, CEO of Primary Data.
That’s not surprising, as the digital Band-Aid has long been the healing method of choice for IT departments – often only because that’s the quickest and cheapest option.
But, in an age of mobility, what becomes apparent is users and devices aren’t the only mobile players – there is an equal need for data mobility, to create an environment where data can be moved between locations as needed.
For Smith and the founders of Primary Data – along with chief scientist Steve Wozniak of Apple fame, whom the team brought over from Fusion-io – this created the ultimate business plan. While at Fusion-io, they helped develop and popularize high-performance flash storage. In solving the challenge of faster data access with flash, they effectively created another challenge in data mobility – one they understood could be solved through data virtualization, the concept of placing data where it’s needed at any moment.
“Today’s applications require you to tell them exactly where data and files are located for them to work,” says Smith. “This is a manual process.”
For example, once you’ve told Microsoft Word where to find the files you have created, if you move them or change their file names, the software can no longer locate them until you point it to them again. Data virtualization enables files to be moved to the most efficient location at any time. When they are in high demand, they can be placed in flash storage arrays, and when the apps with which they are associated are not being used, they may be stored in the cloud.
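The core of that idea is an extra layer of indirection: applications keep a stable logical name while a metadata map tracks where the bytes currently live. A minimal sketch of that decoupling (hypothetical names, not Primary Data’s actual implementation) might look like this:

```python
# Hypothetical sketch: a logical namespace decoupled from physical location.
# Applications open files by logical name; only the map changes when data moves.

class VirtualNamespace:
    def __init__(self):
        self._locations = {}  # logical name -> current physical path/tier

    def register(self, logical_name, physical_path):
        self._locations[logical_name] = physical_path

    def migrate(self, logical_name, new_physical_path):
        # The data moves; the name the application uses never does.
        self._locations[logical_name] = new_physical_path

    def resolve(self, logical_name):
        return self._locations[logical_name]

ns = VirtualNamespace()
ns.register("reports/q3.docx", "flash-array-1:/vol0/q3.docx")
ns.migrate("reports/q3.docx", "cloud-bucket:/archive/q3.docx")
print(ns.resolve("reports/q3.docx"))  # the app still asks for "reports/q3.docx"
```

Because the application never sees the physical path, moving a file between flash and cloud is a metadata update rather than a reconfiguration of every program that uses it.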
Data virtualization follows the same concept as virtualization of any other resources – it allows businesses to maximize their storage resources and use them to capacity, rather than having to over-provision and over-pay.
I look at it as a sort of valet parking for your data, where the Data Director (the metadata engine that tracks each file’s location and usage) indicates whether a file can be parked in short-term parking (flash), overnight (local) or long-term (cloud), based on when it will need to be accessed next.
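The valet-parking decision above boils down to mapping expected time-to-next-access onto a tier. A toy sketch of that rule (the thresholds are illustrative assumptions, not an actual policy) could be:

```python
# Hypothetical sketch of the "valet parking" idea: choose a storage tier
# from how soon a file is expected to be needed again.
# Thresholds are illustrative, not any vendor's actual policy.

def choose_tier(hours_until_next_access):
    if hours_until_next_access < 1:
        return "flash"   # short-term parking: hot data, fastest access
    elif hours_until_next_access < 24:
        return "local"   # overnight parking: warm data
    else:
        return "cloud"   # long-term parking: cold data, cheapest capacity

print(choose_tier(0.5))  # flash
print(choose_tier(8))    # local
print(choose_tier(72))   # cloud
```

A real metadata engine would derive the time-to-next-access estimate from observed access patterns rather than take it as an input, but the tiering decision has this shape.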
To be clear, though, data virtualization is very different from caching. Caching stores only the most recently used files and cannot recognize what requires faster or immediate access – that intelligence must be part of your data center economic model for maximum efficiency and value.
“Flash storage is expensive, so you have to ensure you are maximizing its use,” adds Smith. “With data virtualization, IT can do analytics to understand which apps are using which data and define data movement that happens on the fly based on policy or workload or whatever other criteria makes sense.”
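Smith’s description suggests a loop in which usage analytics feed a movement policy: hot files get promoted to flash, idle files get demoted off it. A small sketch under assumed inputs (the file records, field names, and thresholds here are hypothetical) might look like:

```python
# Hypothetical sketch of policy-driven data movement: access analytics
# decide which files to promote to flash and which to demote to cloud.
# Field names and thresholds are illustrative assumptions.

def plan_moves(files, hot_threshold=100, cold_threshold=5):
    """files: list of dicts with 'name', 'tier', 'accesses_last_hour'.
    Returns (name, from_tier, to_tier) tuples describing planned moves."""
    moves = []
    for f in files:
        if f["accesses_last_hour"] >= hot_threshold and f["tier"] != "flash":
            moves.append((f["name"], f["tier"], "flash"))  # promote hot data
        elif f["accesses_last_hour"] <= cold_threshold and f["tier"] == "flash":
            moves.append((f["name"], "flash", "cloud"))    # demote idle data
    return moves

workload = [
    {"name": "db.log", "tier": "local", "accesses_last_hour": 500},
    {"name": "old.bak", "tier": "flash", "accesses_last_hour": 0},
]
print(plan_moves(workload))
# [('db.log', 'local', 'flash'), ('old.bak', 'flash', 'cloud')]
```

The point of the sketch is the division of labor: analytics supply the per-file usage numbers, policy turns them into moves, and the moves themselves are just metadata updates from the application’s point of view.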