Podcast: Beating Data Latency and Building Data Centers for Modern Applications
By: Blair Felter on November 19, 2015
Today speed is more than a competitive advantage, it's a best practice. People expect stuff quickly, and this new reality is impacting millions and trillions of dollars of business for your customers and their customers. Today's podcast discusses how a modern distributed data center infrastructure can address data latency issues. Joining us are John Panzica, Senior Vice President of Sales at vXchnge; Deepak Aher, SAP Startup Focus, Global Go-to-Market and Market Enablement at SAP; and Kyle Ackerman, Marketing Manager at Approyo.
Ellis: Welcome to the vXchnge podcast. I'm your host, Ellis Booker. Today's topic is Beating Latency: Data Centers for Modern Applications. We're going to consider why latency can be a scourge of cloud-based businesses, and how a modern distributed data center infrastructure can address that. Joining us is John Panzica, Senior Vice President of Sales at vXchnge. Hello, John.
John: Hi, Ellis. How are you today?
Ellis: Also joining us is Kyle Ackerman, Marketing Manager at Approyo. Hello, Kyle.
Kyle: Hello. Thanks for having me.
Ellis: Finally we have Deepak Aher, SAP Startup Focus, Global Go-to-Market and Market Enablement at SAP. Hello, Deepak.
Deepak: Hello, Ellis, and hello everybody.
Ellis: John, let's start with you. Why is latency an important topic when we're talking about things like big data?
John: Thank you, Ellis. So latency, or as some people think of it, speed, is really important when you're talking about gigabytes and even petabytes of data that have to be sorted through, especially unstructured data.
And there are a lot of components to latency that factor into the technology chain, if you will. Geographic proximity is one. Being able to regionalize where connections are, where end users are relative to the source of the data, certainly makes a big difference. If you look across the United States, coast to coast, for example, it's about 80 to 90 milliseconds to traverse the country on a data network. That is an enormous amount of time when you're talking about petabytes of data that you have to analyze.
So the more you can regionalize that, perhaps into an East Coast and a West Coast scenario, or even get down to a single location in a data center, the better; geographic proximity becomes really important.
Whereas if you're within a region, it could be down around 3, 4, or 10 milliseconds, and if you actually go inside a data center, you can get down into the microseconds, which is really significant. There are 1,000 microseconds in a millisecond, and inside a data center you're able to condense that down to a couple hundred microseconds. The faster you're able to process the data, the more efficiently you're going to get meaningful information back to an end user.
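To put those units in perspective, here is a quick back-of-the-envelope sketch. The figures are the illustrative ones from the conversation (roughly 85 ms coast to coast, 5 ms within a region, a couple hundred microseconds inside a facility), not measurements:

```python
# Rough latency-budget arithmetic using the figures from the discussion.
MICROSECONDS_PER_MILLISECOND = 1_000

def to_microseconds(ms):
    """Convert milliseconds to microseconds."""
    return ms * MICROSECONDS_PER_MILLISECOND

cross_country_us = to_microseconds(85)   # ~85 ms coast to coast
regional_us      = to_microseconds(5)    # ~5 ms within a region
in_datacenter_us = 200                   # ~200 microseconds inside one facility

# How many in-data-center round trips fit into a single
# cross-country hop? This is why proximity matters so much.
print(cross_country_us // in_datacenter_us)  # 425
```

In other words, one coast-to-coast traversal costs as much time as hundreds of exchanges inside a single data center.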
Ellis: And this is a question for the group. For applications that require that kind of sub-millisecond, microsecond speed, we think of financial services, trading for example. But there are other things coming online that require it as well. Does anyone have a thought about some of the modern applications that are going to need that kind of speed and absence of latency?
Deepak: I can take that. SAP is essentially one of the business-critical applications in most enterprises, and what we've seen traditionally is that the ERP-based applications that ran businesses for our customers were focused on batch processing. You put something in today, and two or three days later the systems would get updated. Those were systems of record.
But that is changing today, and they are now being transformed into systems of innovation. Which means it's more real-time, it's more now; it's about making decisions at this particular moment that will impact millions and trillions of dollars of business for our customers and their customers.
That is where companies like ours, SAP, are trying to bring that evolution to our customer segment. And we're doing it with our flagship product, SAP HANA, an in-memory computing platform that delivers roughly a 10x to 1000x difference in the speed it brings to different applications.
And these are not applications that you start in the morning, check in the afternoon, and find still running in the evening. These applications are about taking data that comes from a geospatial source, from a financial transaction, or from a predictive algorithm forecasting the yield of a specific agricultural product or a manufacturing output. Or the question is one of speed: before this consumer actually swipes the credit card, how soon do I need to know that this consumer is not a risk to my business? If you're talking about that, it comes down to milliseconds and nanoseconds, to real-time information.
Today, all these applications are written by large and small ISVs. But in the last couple of years, as you'll have seen, there have been hotbeds of startups, what we call a "new ecosystem" being built, that are agile, flexible, and extremely innovative in their ways. They are the ones pushing the limits, creating new business models and economies for companies like us to leverage. And to leverage those applications, you need a rock-solid, low-latency system. So that's essentially my comment on the question you asked.
Ellis: Sure. And you didn't mention it, but we've talked about this previously, you and I, Deepak: the Internet of Things, health informatics. If a patient is starting to have a heart attack, you don't want your application to realize that a week later, or even a day later, or even an hour later. You need to know in real time. Right? You mentioned a second ago SAP's own in-memory database for big data, HANA. For those who might not be that familiar, can you tell us a little bit about SAP HANA, who's using it, and how they're using it?
Deepak: Sure. As you know, SAP has traditionally been an enterprise application software company. We've run our customers' businesses for going on 43 years now. And we have helped customers transform their businesses, from manufacturing, automotive, and aerospace and defense, to strategic industries like retail, consumer products, life sciences, and so on.
There was a need for us to create a platform of choice that could accommodate the changes we bring to our customers, and more importantly, the demands our customers are putting on us. It was extremely important for us to come up with this in-memory database and application platform, which is about 1,000 times faster than the regular databases available in the market on current hardware.
It was also about simplification of design and operation because a lot of times, you can have a faster database, but if it doesn't really simplify your operation, you're going to end up having a challenge.
The third important concept was real-time business applications. Not all businesses require real-time, but companies are moving toward it; the market is demanding that they move to real-time. Hence, there was a need to take multiple application servers, huge operational data stores, data marts, and complex business intelligence tools, and combine all of these into one.
That's where there was a need for a reinvention of the database, and that's what SAP took on, asking: should we look at a platform, or should we look at creating a new database? The answer was that we created an in-memory computing platform which has the standard database capabilities, but what it really offers today goes beyond database storage, beyond the so-called database parameters you would normally look at.
Just to put it in context, a significant amount of business today, I think 70,000-plus transactions on the financial streets, runs on this kind of in-memory computing platform like HANA. And about 70% of the world's chocolate is processed on these kinds of systems. Is chocolate a real-time need? No. But when you look at financial transactions, or connected networks in Ariba's business environment, those are very business-critical in nature.
For example, take moving the supply chain for an automotive or aircraft manufacturing part. If a tsunami hits a particular manufacturing hub, let's say in India, how soon can you predict it? And if you do predict it, how soon can you move an entire manufacturing unit to a different location without holding up your other manufacturing priorities or KPIs?
So that's essentially why HANA was brought into existence, and that's where HANA is playing today. We have about 250-odd strategic customers among the Fortune 1000 who are using it and running multiple applications on it.
Ellis: So John, moving back to you as a provider of data center services, what are you seeing from 100,000 feet when it comes to customers moving to this real-time in-memory type database, HANA or some of the alternative systems?
John: Sure. There are many use cases and instances we're seeing across many different industries. Before I talk about that, I wanted to comment on the speed. You talked about latency; there's also what HANA is processing in-memory, and all of the processing being completed from a time-frame standpoint. That's the throughput component.
So it's the combination of the geographic and network-based latency that goes all the way down to your hardware, your in-memory processing, and then into the database as well. Those are the things that are really enabling these use cases.
I was talking to a pharmaceutical company this morning, we actually had them in our offices, and they were talking about doing extensive DNA studies in doctors' offices to do predictive analytics on whether you'll have cancer, what type of cancer, and in what year. The norm over the years has been that it takes a week; it started out much longer, a couple of months to analyze that data, then down to weeks, and over the past few years it's been about a two-day window.
This pharma company was saying they'll have it back in near real-time. Within minutes, they're going to have the data back, analyzing your complete DNA makeup and being able to tell you, with a certain degree of confidence, whether you're going to have cancer, the types you're predisposed to, and the risks associated with it. That was a very powerful use case that I'm sure resonates with a lot of folks. So it's things like that, things that change people's lives, all the way to more fun things.
And you and I spoke about it in the past: a use case with the NHL and SAP, letting sports fans all over the world do real-time analysis on hockey players, hockey games, and scoring statistics. So you see the sports fan being affected by this as well, in a really positive way, getting personalized data delivered right to a mobile device, no matter where they're sitting in the world.
So it's fun things like that, or to more serious things like the pharma example, it's really changing industry after industry.
Ellis: John, before we leave you on this question, I do want to ask: as a provider of data center services, how are you managing a distributed environment where you might have a data center in one part of the country, possibly one part of the world, and an application in another, that requires this low-latency capability? That seems like a new, very difficult architectural problem to solve. How does vXchnge, for example, address that?
John: Yeah, it's certainly a newer trend in big data analytics, and SAP is certainly part of driving that trend. When you talk about data being distributed like that to end users who could be anywhere in the world, in any big city or small city, the content has to be there. And since we're talking about real-time, about milliseconds, microseconds, and in some cases nanoseconds, you have to have infrastructure in the right places to achieve those kinds of goals, to provide that kind of user experience.
Somebody has an iPhone in their hand and they're interacting with an Uber application, for example, or they're doing a simple web browser session, and they're sitting in a smaller city, let's say Pittsburgh or Philadelphia. In years past, you would serve that market from a Cleveland or a New York or a D.C., in a kind of hub-and-spoke architecture.
For many years, infrastructure was set up that way, whether you're talking about telecom networks and dark fiber routes or content distribution networks. All of that infrastructure has since been disseminated to what we call "the edge," where you take that infrastructure and put it locally. So now your latency goes significantly down, your throughput goes up, and your user has a much more positive experience with their application or the data they're trying to reach.
That's something vXchnge has been very good at. We have 15 data centers across the U.S. now, and we're strategically building based on that trend, providing infrastructure in some of the smaller markets, if you will. The demand for that infrastructure is there because the content needs to be there, because the big data analytics need to be delivered, because of the real-time nature of the data that must be there.
If you're sitting in a doctor's office in Nashville, you want the same response times as if you were sitting in New York City, and that all happens through technology infrastructure being in the right place.
Ellis: I want to bring Kyle...oh, Deepak, just a second, I want to bring Kyle in for a second. Kyle, from Approyo's perspective, have you seen any of these cool, innovative applications that, again, require low latency and these new database architectures, these new data center and networking architectures? Any cool apps that you're aware of?
Kyle: Sure. I think the big thing we've been talking about is that it's really across all industries. People expect things quickly now. Whether you're in New York City, Milwaukee, Wisconsin, Los Angeles, or even another country, people expect their apps and information almost instantaneously, and having that kind of infrastructure in place is really, really important.
And again, it's across all industries. We have another use case scenario: a client in the transportation industry who needs to know where their drivers are and what the best routes are. That's something that's pretty fascinating, in my opinion: taking information from your map programs, your Google Maps, and cross-checking it against the routes being taken to make sure it's the most efficient use of time and gas from a cost-saving perspective.
Ellis: Good. Deepak, you have something to add, I think?
Deepak: Yeah, talking about the scale and the latency we've discussed, the aspect of speed, the customer examples we see today are huge. Very specifically, speaking to what John and Kyle mentioned: customers of SAP were running a query, just a query execution in their ERP, that was taking on average about 112.86 seconds.
Guess what? Even that was a hindrance to a lot of their business units. They used HANA and brought it down to 7.07 seconds, a 93.7% faster query execution. Why is that important? Because these are people building aerospace and defense products. For the multimillion parts that go into a jet engine, going from 112 seconds to 7 seconds is definitely a more efficient, effective way of working.
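The figures Deepak quotes check out arithmetically; a quick sketch of the calculation:

```python
# Verifying the quoted query-execution improvement:
# ~112.86 s before HANA, ~7.07 s after.
before_s = 112.86
after_s = 7.07

# Percentage reduction in execution time.
improvement_pct = (before_s - after_s) / before_s * 100

# Equivalent speed-up factor.
speedup_factor = before_s / after_s

print(f"{improvement_pct:.1f}% faster")   # 93.7% faster
print(f"{speedup_factor:.1f}x speed-up")  # 16.0x speed-up
```

So the quoted 93.7% corresponds to roughly a 16x speed-up, comfortably within the 10x-to-1000x range mentioned earlier.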
Also, from a HANA perspective, data compression has become hugely important. Data is growing day by day, and when you have that much data and you're querying systems to identify the relevant results for a use case, terabytes of data have to be consumed in a way that they can be analyzed for whatever analysis you're doing. Hence, a database like HANA has built-in data compression capabilities.
There are also the load times: how much time does it take to load the data into your infrastructure or your application? And then, most importantly, the cache execution: how much of your data can be served and rebuilt from cache? People have seen cache execution performance of 1.35 seconds to 2 or 2.5 seconds. This is the need of the hour as we go into conquering new businesses and new trends.
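One common technique behind the column-store compression Deepak mentions is dictionary encoding: each distinct value in a column is stored once, and the column itself becomes a list of small integer codes. The sketch below is a minimal illustration of the idea with made-up data, not HANA's actual implementation:

```python
# Minimal sketch of dictionary encoding, one technique column
# stores use to compress data. Each distinct value is stored
# once in a dictionary; the column becomes integer codes.

def dictionary_encode(column):
    dictionary = sorted(set(column))                            # distinct values
    code_of = {value: code for code, value in enumerate(dictionary)}
    codes = [code_of[value] for value in column]                # one small int per row
    return dictionary, codes

# A low-cardinality column (e.g. a country code repeated
# millions of times) compresses extremely well this way.
column = ["DE", "US", "DE", "IN", "US", "DE", "DE", "IN"]
dictionary, codes = dictionary_encode(column)
print(dictionary)  # ['DE', 'IN', 'US']
print(codes)       # [0, 2, 0, 1, 2, 0, 0, 1]
```

Because the codes are small integers, scans and comparisons can also run on the compressed form, which is part of why compression and query speed go hand in hand.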
Ellis: And that will have to be the last answer, because we're out of time. Thanks, everyone, for participating in this edition of the vXchnge podcast. If you would like to hear other podcasts, suggest show topics, or be a guest yourself, please visit vXchnge.com and get in touch with us. Thanks for listening.
About Blair Felter
As Marketing Director at vXchnge, Blair is responsible for managing every aspect of growth marketing and inbound strategy to grow the brand. Her passion is finding the topics that generate the most conversations.