Is Data Fragmentation Holding Your Project Back?
Rawlinson Rivera is the Global Field Chief Technology Officer at Cohesity, Inc., and is well known as an industry thought leader on cloud enterprise architectures and hyper-converged infrastructures. Rawlinson joined Cohesity’s leadership team following 10 years at VMware, where he was most recently the Principal Architect working in the Office of the CTO for the Storage and Availability Business Unit. Rawlinson has also authored numerous books on VMware and Microsoft technologies, and is the author of the popular blog PunchingClouds at www.punchingclouds.com.
Data has been described as the new oil — but also as the new asbestos. We know that data, if managed and exploited in the right way, can be the ultimate digital asset powering a company to success. But, if not managed correctly, that same data can create dire consequences for enterprises around the world — including compliance penalties, inferior customer experiences and major competitive pitfalls.
In other words, what should be the most valuable resource driving business today — data — instead can easily become the biggest obstacle to digital transformation.
So why is it that a handful of well-known companies (with valuations to match) have disrupted entire industries by exploiting the power of data (think Facebook), while the majority are still struggling with basic tasks like making sure backups happen, SLA objectives are met, or eDiscovery requests get a response? Instead of viewing data as a competitive asset, why do most organizations treat it as a storage expense or an overly complicated management problem that can lead to challenges with morale and turnover on the IT team?
The Mass Data Fragmentation Problem
To answer these questions, we have to look at the underlying issues that got us here in the first place, and one issue in particular: mass data fragmentation.
Mass data fragmentation refers to the vast and growing sprawl of data that is scattered across different locations, trapped in infrastructure silos, and buried unseen in long-forgotten storage systems. The vast majority (perhaps 80 percent) of a company’s data is secondary data, meaning it is non-mission critical and subject to less stringent service-level agreements.
Secondary data is stored in backups, file shares, archives, object stores, test and development systems, and private and public clouds. Much of this data consists of duplicate copies, or duplicates of duplicates (research shows cases in which companies have 11 or more copies of the same data!) and nearly all of it is dark data — meaning companies can’t locate it, protect it, or look at its contents.
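The duplicate-copy problem described above can be made concrete with a minimal sketch: hashing file contents to reveal identical copies scattered across separate directories (standing in for separate silos such as file shares and archives). This is only an illustration of the idea; the directory layout is hypothetical, and real deduplication systems operate at the block level and across backup formats rather than on whole files.

```python
import hashlib
import os
from collections import defaultdict

def find_duplicates(roots):
    """Group files by content hash to reveal redundant copies.

    `roots` is a list of directories standing in for separate
    data silos (file shares, archives, test/dev copies, etc.).
    Returns a dict mapping a content hash to the list of paths
    holding identical bytes, keeping only hashes seen twice or more.
    """
    by_hash = defaultdict(list)
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                digest = hashlib.sha256()
                try:
                    with open(path, "rb") as f:
                        # Hash in 1 MiB chunks so large files
                        # are not read into memory at once.
                        for chunk in iter(lambda: f.read(1 << 20), b""):
                            digest.update(chunk)
                except OSError:
                    continue  # unreadable file; skip it
                by_hash[digest.hexdigest()].append(path)
    # Only hashes with more than one path represent duplicated data.
    return {k: v for k, v in by_hash.items() if len(v) > 1}
```

Even this naive approach surfaces the scale of the problem: the same bytes counted once per silo rather than once per business.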
While secondary data is just as important to the business as the data running in primary systems, it has become a serious and intractable problem because of mass data fragmentation.
IT Professionals on Mass Data Fragmentation: The Problem is Real
How much of a problem is mass data fragmentation? Real people in real organizations have confirmed the gravity of the problem, and it’s not pretty.
A recent global survey of 900 senior IT professionals revealed a strong majority — nearly 70 percent — believe their secondary data will increase anywhere from 25 percent to 100 percent or more by the end of 2019. And nearly nine out of 10 organizations believe secondary data is fragmented across silos and is, or will become, nearly impossible to manage long-term.
Copies are a major factor, with 63 percent saying they have accumulated between four and 15 copies of the same data across multiple silos. Of those respondents that store data in the public cloud, nearly three-quarters of them consciously make a redundant copy of public cloud data.
Organizations are not blind to the potential risks they face: Over 90 percent of respondents noted their concern about a lack of data visibility and the inherent compliance risks. Additionally, of those who believe their data is fragmented, almost half believe that mass data fragmentation will put them at a competitive disadvantage and that the customer experience will suffer.
To house their already-exploding data, some organizations are using 11 or more separate data solutions today, along with four or more public clouds in addition to their data centers.
To rephrase these findings: MOST organizations think their data is spiraling out of control.
Not surprisingly, there is a toll on the IT teams tasked with managing secondary data for the business. IT teams reported they are dedicating at least 30 percent (and up to 100 percent) of their time and effort to managing all the complexity, and moving forward, it may take an additional 16 weeks of IT’s time each year if proper tools aren’t in place. If that happens, staff turnover could be an inevitable result: Nearly 40 percent fear massive turnover, while over a quarter said that they, or members of their team, may quit their jobs without the proper technology or budget to help.
We’re All in This Data Mess Together
If you identify with some — or all — of these findings, don’t feel guilty. It’s not really your fault. There has been almost no true innovation in the secondary data world in 25 years!
Fragmented data silos are a natural outcome of vendors producing single-purpose tools, each optimized for a specific function — like backup, file and object storage, and archiving — that were not designed to share data with other systems.
Adding to this data conundrum is the move to multi-cloud infrastructure. As we speak, a whole new generation of silos and data islands is being created across all the major public cloud environments.
So it’s not really surprising that IT is having a hard time meeting basic service level agreements. But there is a path forward.
Confronting Mass Data Fragmentation
Before companies will ever be able to use their data as a positive asset, they will need to confront the issue of mass data fragmentation head-on. This means finding a way to consolidate and simplify data infrastructure.
Nobody expects the shift to multicloud infrastructure to reverse itself — companies will not suddenly decide that they can solve all their data needs using one cloud or one data center. Instead, consolidation and simplification will require connecting different clouds and data centers onto one platform so that they can be managed as a whole — from one user interface. Companies should also be able to rely on this same platform to manage all secondary data workloads — backups, test/dev, archiving, file shares, object stores and data sets used for analytics.
By connecting infrastructure, workloads and storage locations, companies will realize enormous benefits in terms of reduced management complexity, more efficient storage usage, and equally important, the ability to extract maximum value from all of their secondary data. Without a unified approach, mass data fragmentation will lead to greater and greater obstacles in IT. But making the effort to simplify and consolidate secondary data makes it possible for companies to turn that data into a positive asset they can leverage for future success.