Layers of Latency – Cloud Complexity and Performance
The cloud has enabled enterprises to dramatically improve how they operate, bringing information and applications to every corner of the globe and absorbing the ever-growing volumes of big data. Users can access cloud applications from literally anywhere with an Internet connection, and those applications can be housed across multiple data centers sprinkled around the globe. Because of this flexibility and availability, more than 30 percent of enterprises worldwide use at least one cloud-based solution. What’s more, cloud revenue is expected to grow 500 percent from 2010 to 2020 as cloud applications and companies multiply and expand.
Despite the possibilities the cloud opens up, it cannot always deliver on performance demands, sometimes leading to subpar end-user experiences. For example, research by Mike Belshe, a Google Chrome engineer, found that an extra 20 milliseconds of network latency can increase page load times by roughly 15 percent. Other studies from Amazon and Google found that a half-second delay causes a 20 percent drop in traffic on Google, and that a one-tenth-of-a-second delay can lower Amazon’s sales by 1 percent. Clearly, latency is not just a nuisance but a serious problem for enterprises that house their applications in the cloud.
To mitigate the growing effects of latency, it’s important to understand what causes it and how enterprises can reduce it. With both the Internet and cloud computing playing a role in how we share and access applications, latency is far more complicated than one might suspect. Before the cloud era, latency was determined largely by the number of router hops required for data to travel from origin to destination. Enterprises, for the most part, owned their networks and all their components, so each additional hop a packet made between two computers or servers simply added another increment of delay.
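A back-of-envelope sketch of that hop-count model: end-to-end delay is just the propagation time over the path plus the per-hop processing and queueing delays. All the numbers below are illustrative assumptions, not measurements.

```python
# Hop-count latency model: total delay = propagation + sum of per-hop delays.
# Every value here is an illustrative assumption.
per_hop_ms = [0.5, 1.2, 0.8, 2.0, 1.5]  # queueing + processing at each router
propagation_ms = 30.0                    # speed-of-light delay over the path

total_ms = propagation_ms + sum(per_hop_ms)
print(f"{len(per_hop_ms)} hops -> {total_ms:.1f} ms one-way")
```

In this model, latency grows predictably with each hop an enterprise adds to its own network, which is why pre-cloud latency was so much easier to reason about.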
Today, a typical network is broken down into hundreds — if not thousands — of components, each owned, operated and managed by a different entity. As a result, enterprises often have little insight into the performance of their network, let alone the ability to optimize it or reduce latency. In this distributed computing scenario, if even one server out of hundreds is experiencing latency, cloud application users will see slower load times and degraded performance.
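To see why a single slow server out of hundreds matters, consider a request that fans out across many backends and must wait for the slowest reply before responding: a rare slowdown on any one machine becomes a common slowdown for the whole request. A minimal simulation, using a made-up latency distribution purely for illustration:

```python
import random

random.seed(42)

def request_latency_ms():
    # Hypothetical per-server latency: fast 99% of the time,
    # occasionally slow (an assumed distribution, not real data).
    if random.random() < 0.01:
        return random.gauss(500, 100)
    return random.gauss(20, 5)

def fan_out_latency_ms(n_servers):
    # A fan-out request is only as fast as its slowest backend.
    return max(request_latency_ms() for _ in range(n_servers))

trials = 1000
single = sum(request_latency_ms() for _ in range(trials)) / trials
fanout = sum(fan_out_latency_ms(100) for _ in range(trials)) / trials
print(f"avg latency, one server:       {single:6.1f} ms")
print(f"avg latency, fan-out over 100: {fanout:6.1f} ms")
```

With a 1-percent chance of any one server being slow, a fan-out over 100 servers hits a slow backend on most requests, so the average response time balloons even though each individual server is almost always fast.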
Compounding the effects of distributed computing, virtualization adds another layer of complexity to cloud latency. Once simple warehouses for rack-mounted servers, today’s data centers are complex webs of hypervisors, each running dozens of virtual machines. Within this thicket of virtualized network infrastructure, packets often incur delays before data even leaves the rack.
Because such sprawling, complex networks are increasingly common, many connectivity providers now offer service level agreements, or SLAs, that guarantee a minimum level of service and network performance. Service providers of all types, whether telecom or cloud, work hard to uphold the minimums their SLAs outline. When it comes to cloud transactions conducted over the public Internet, however, providers often establish no SLA at all. This is largely because cloud latency is such a new phenomenon: connectivity providers are still working out how to ensure strong uptime for cloud applications, not to mention what levels to set.
It’s clear that these three factors (intricate networks, virtualization and a lack of SLA standards) create extremely unpredictable and unregulated service levels. The problem, however, is not necessarily latency itself but its unpredictability. To overcome that unpredictability, enterprises need to establish a performance baseline and then keep as many cloud applications as possible performing at that level. Only then can they work to reduce latency itself.
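One way to establish such a baseline is to probe each cloud endpoint repeatedly and record percentile latencies; the gap between the median and the 95th percentile then puts a number on the unpredictability. A minimal sketch — the endpoint names and latency figures are hypothetical, and the probe is simulated so the example runs without network access:

```python
import random

random.seed(7)

def probe_ms(endpoint):
    # Stand-in for a real round-trip measurement (e.g. timing an HTTP GET);
    # simulated here with assumed per-region latencies so the sketch is
    # self-contained.
    base = {"app-us-east": 25, "app-eu-west": 90}[endpoint]
    return random.gauss(base, base * 0.1)

def baseline(endpoint, samples=200):
    # Collect many samples; keep the median (typical case) and the
    # 95th percentile (bad case). Their gap measures unpredictability.
    xs = sorted(probe_ms(endpoint) for _ in range(samples))
    return {"p50": xs[samples // 2], "p95": xs[int(samples * 0.95)]}

for ep in ("app-us-east", "app-eu-west"):
    b = baseline(ep)
    print(f"{ep}: p50={b['p50']:.1f} ms  p95={b['p95']:.1f} ms")
```

Once an enterprise has baselines like these per application, it can tell whether a slowdown is the network’s normal variability or a genuine regression worth chasing.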
Many have found that establishing a direct connection to a public cloud is one way to reduce this unpredictability. Such connections, offered by many leading cloud providers, including Amazon Web Services (AWS), link an enterprise’s network to the public cloud without involving hundreds of intermediate servers or virtual machines. In effect, an enterprise gets its own lane in which cloud traffic travels back and forth to the home network. That traffic is no longer subject to the unpredictability of the general Internet, and performance becomes far more calculable. These “cloud on-ramps” are seen as a key part of cloud providers’ offerings, with performance backed by strict quality-of-service guarantees.
Outages and latency continually remind us that the cloud is not perfect. Despite its high-performance capabilities, this is still a young technology, and both users and providers are working out the kinks to establish steady service levels. The first step toward a low-latency, high-performing cloud is identifying the causes of performance degradation. From there, we can only move onwards and upwards: into the cloud.
This article was originally published on WIRED (September 18, 2012).