In 1981, George Carlin first joked about needing a new house for all his stuff, buying a bigger house, and then getting more stuff. The development of the data storage industry mirrors this joke closely: first we needed more space to store data; then the space we had, and the new ways to store it, encouraged the digitization of all kinds of data, necessitating even more space for all that data!
As a result, tech media has experienced periodic outbursts of panic about whether or not the world’s data centers have the storage capacity to handle all that information. As far back as 2007, the alarmist headline “Capacity Crisis: Data Centers Running Out of Space and Are on Power Overload” demonstrated prescience regarding where many industries were headed in terms of data usage and, thus, data storage. The article warns that data centers would, no doubt, face a “time of crisis” and would soon encounter “more and more instances of downtime and failure.”
These warnings have yet to come to fruition. Not only has the data storage industry met the challenges before it, but with investment in development, both physical and logical, it has positioned itself to handle our exponentially growing needs for data transmission and data storage.
Are There Data Center Storage Capacity Constraints?
There’s no argument in the data world that there is, or at least will be, a need for continued data center and data storage expansion, but what that future looks like is unknown. And while the world’s data centers currently store about 1,327 exabytes of data, and we’re producing about 2.5 exabytes of data daily, we’re not in danger of running out of data storage space just yet.
In fact, data center capacity is far more constrained by physical and facility issues than by logical ones, and even those constraints have not yet been reached. As research and development into data compression, quantum computing, and nanotechnology continue, storage capacity will likely, at a minimum, grow with demand and, at best, far exceed it.
How Much Data is Actually Stored in Data Centers Worldwide?
The big three (Amazon, Google, and Facebook) each maintain massive data centers of their own, both in the U.S. and worldwide. At last official count, Google had 15 data centers around the world, Amazon had 14, and Facebook had 12, together totaling 15 million square feet of space. These three tech behemoths obviously have storage needs that exceed those of most other companies, hence the need for their own expansive data centers. It’s also important to note that not all of the data transmitted throughout the day needs to be stored. So even if we’re looking at unfathomable numbers in terms of data transmission, storage is a different issue altogether.
As this industry continues to both drive and enable new applications, services, and needs for data, it positions itself to create its own need for growth. From what we have witnessed thus far one technological advancement (consider smart phones and their essential creation of the mobile application market) has the potential to shift the entire industry again. Given our reliance on data and how the need for reliable data storage and transmission will continue, the industry may only be limited by the availability of devices used to send and receive data.
Technological Improvements Enhancing Data Center Storage Capacity
As noted above, the primary challenges facing the data storage and data center industry right now have more to do with physical facility and space constraints than with the data itself. We haven’t reached capacity there, but when we consider those 15 million square feet of space, plus the need for power and cooling on top of land use, we start to see where the pain points might be in the industry.
Thankfully, one area that’s keeping pace with the industry’s growth is the development of storage technologies that minimize those other challenges by changing how data itself is stored.
Improvements in Memory
For decades, hard disk drives (HDDs) were the foundation of data storage due to their reliability and relatively low cost. While there is still some room for innovation, HDDs are finally hitting their performance limits; further improvements would demand far too much power to be practical.
A major source of growing data volume is the proliferation of Internet of Things (IoT) devices. Continuously connected to network infrastructures and constantly gathering data, IoT devices were expected to exceed 20 billion units by 2020. That number appears to have been reached much sooner: by the end of 2018, an estimated 22 billion units were already in use. These devices account for many of the exponential growth projections that are raising questions about global data storage capacity. Even the most conservative estimates expect IoT devices to generate dozens of zettabytes (1 zettabyte is roughly equal to 1 trillion gigabytes) worth of data annually within the next five years.
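The storage-unit arithmetic above is easy to verify. A minimal sketch in Python, assuming the decimal (SI) definitions of the units, where each prefix step is a factor of 1,000:

```python
# Decimal (SI) storage units, measured in bytes.
GIGABYTE = 10**9   # 1 GB
ZETTABYTE = 10**21  # 1 ZB

# Express 1 zettabyte in gigabytes.
gb_per_zb = ZETTABYTE // GIGABYTE
print(f"{gb_per_zb:,}")  # 1,000,000,000,000 -> one trillion gigabytes
```

So a projection of "dozens of zettabytes" per year really does mean tens of trillions of gigabytes.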
Fortunately, the very nature of IoT devices makes them a slightly less daunting problem than they might initially appear. Much of the data that these devices gather is processed locally, oftentimes with the device’s own computing power. In other cases, this data will be relayed to an edge data center rather than an enterprise-grade or hyperscale data center. Edge data centers may gather data, but they aren’t primarily used for storing it. Much of the information gathered by IoT devices is either redundant or non-essential and can easily be discarded. While edge computing architectures will require powerful analytics to determine what data needs to be retained and what can be marked for deletion, implementing these filtering measures will greatly diminish the pressure on data center capacity.
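The filtering step described above can be sketched as a simple deduplication pass at the edge. This is only an illustration of the idea, not any real edge platform's API; the `(timestamp, value)` reading format and the `tolerance` threshold are assumptions chosen for the example:

```python
def filter_readings(readings, tolerance=0.5):
    """Drop sensor readings that barely differ from the last retained
    value, so only meaningful changes are forwarded upstream for storage.

    readings: iterable of (timestamp, value) pairs (illustrative format).
    tolerance: minimum change considered worth keeping (illustrative).
    """
    retained = []
    last_value = None
    for timestamp, value in readings:
        if last_value is None or abs(value - last_value) >= tolerance:
            retained.append((timestamp, value))
            last_value = value
    return retained

# A temperature sensor reporting every second: most samples are redundant.
samples = [(0, 21.0), (1, 21.1), (2, 21.1), (3, 23.0), (4, 23.1)]
print(filter_readings(samples))  # keeps only (0, 21.0) and (3, 23.0)
```

Even this naive rule discards three of five samples; real deployments would use richer analytics, but the effect on upstream storage pressure is the same in kind.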
AI and Unstructured Data
Much of the data being generated today is unstructured, and within five years that share could reach 80% if estimates are accurate. Unstructured data is distinguished by its lack of any specific format. It can come in many sizes, shapes, and forms, making it a challenge to manage. This data can contain many valuable business insights, but finding them can be like searching for a needle in a haystack: only about 10% of unstructured data is worth saving for analysis.
Much of the data generated by IoT devices is unstructured, and so the same basic strategies can be deployed to deal with other forms of unstructured data. By using cognitive, AI-driven technology, companies are already finding ways to better interpret, evaluate, and derive insights from this data, making it easier to manage in the process. Just because data centers have exabytes worth of storage doesn’t mean every scrap of data needs to be preserved. Much of this information is redundant, irrelevant, or damaged, so any tools that make it possible to identify and discard “useless” data will prove invaluable in effective data storage and management.
Future Investments in Data Center Construction
Until we realize some of the advancements on the horizon, one solution remains intact: build more data centers!
In addition to investing in new memory technology and using analytics to separate the wheat from the chaff, companies are also taking steps to increase their overall data capacity, investing more than $18 billion in data center construction in the US alone. This figure doesn’t include plans to retrofit existing facilities to take advantage of new storage technology and best data practices. Over the next two to five years, multi-tenant data center revenue is expected to increase by 12%-14% per year. Much of that growth will come from expanded capacity.
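To put that growth rate in perspective, 12%-14% per year compounds quickly. A quick back-of-the-envelope calculation, using an arbitrary starting index of 100 purely for illustration:

```python
def project_revenue(start, annual_rate, years):
    """Compound a starting revenue figure at a fixed annual growth rate."""
    return start * (1 + annual_rate) ** years

# An index of 100 today, grown at the low and high ends of the
# 12%-14% range over a five-year horizon.
low = round(project_revenue(100, 0.12, 5), 1)   # 176.2
high = round(project_revenue(100, 0.14, 5), 1)  # 192.5
print(low, high)
```

In other words, sustained at those rates, the sector's revenue would roughly grow by 76%-93% over five years.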
In addition to highly agile edge data centers that are helping to realize the potential of IoT, massive hyperscale data centers are rapidly taking over the enterprise sector. The US is a leading player here as well, with about 44% of the world’s hyperscale facilities, far outpacing China (8%) and Germany (5%). Some estimates even predict that over 50% of all data traffic will pass through these massive data centers within the next few years. On the other end of the spectrum, many companies are experimenting with modular data centers that can be assembled onsite to place storage closer to end users and repurpose existing commercial space to take the pressure off network capacity.
How a Colocation Provider Can Help You Fend Off Capacity Concerns
With innovations in memory hardware and data center construction over the last decade, reports of “data center overload” and fears over capacity seem to have been greatly exaggerated. While data storage will always be a critical concern for companies, the combination of new storage technology and more efficient memory management is poised to provide companies with all the tools they need to deal with the truly massive amounts of data being generated today and in the future.
One of the best ways to alleviate your concerns about storage is to work with a colocation provider, one who is poised to handle the changing needs of both your business and the data storage industry. With multiple data center locations, vXchnge is uniquely positioned to take advantage of edge computing capabilities and to embrace some of the changes discussed here. If storage and capacity are concerns for your business, get in touch today!
About Ernest Sampera
Ernie Sampera is the Chief Marketing Officer at vXchnge. Ernie is responsible for product marketing, external & corporate communications and business development.