Auto-Scaling Cloud Applications: Practice is Harder Than Theory

November 16, 2017

One of the greatest strengths of cloud computing is that many of its benefits are easy to conceptualize and articulate. That’s certainly true of cloud elasticity and agility. Your cloud-based application suddenly needs more or less computing power or storage? Simply tap some additional cloud resources – or release unneeded capacity – and problem solved!

At a fundamental level, every cloud customer gains improved IT elasticity and agility. But fully exploiting the potential of these cloud characteristics requires automating the scale-up/scale-down process, rather than manually adding or removing cloud instances as your application’s needs fluctuate. As you might guess, implementing auto scaling requires a fair amount of planning and analysis.

[Image: Using auto scaling creates true cloud elasticity]

One obvious requirement for those seeking to implement auto scaling is the need to place upper and lower limits on how many resources – and how much expense – the cloud system can add or drop on its own. If a company sets the minimum too high, for example, it may keep instances running – and costs accruing – even when they’re not needed.
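As a rough sketch of what those guardrails look like in practice, here is how the floor and ceiling might be set on an AWS Auto Scaling group using the boto3 SDK; the group name and the specific limits are hypothetical:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Hypothetical guardrails: never fewer than 2 instances (for availability),
# never more than 10 (a hard cost ceiling). The group name is made up.
autoscaling.update_auto_scaling_group(
    AutoScalingGroupName="web-tier-asg",  # hypothetical group
    MinSize=2,    # the floor: set too high, and idle instances keep running
    MaxSize=10,   # the ceiling: caps how much spend a traffic spike can add
    DesiredCapacity=2,
)
```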

Other aspects of auto scaling can be even trickier to navigate. The description of Auto Scaling Groups on the Amazon Web Services website ends with a seemingly innocuous statement that hints at one of the core challenges: “The better you understand your application, the more effective you can make your Auto Scaling architecture.”

Well, yes. Unfortunately, far too many organizations don’t really have in-depth insights into their applications, or into the infrastructure resources that their applications actually need and use. We discussed this lack of visibility in an earlier post that focused on the difficulty in right-sizing cloud instances.

Even with good visibility, when it comes to the ease and efficiency of application scaling, a lot depends on the nature of the application itself. Scaling websites and web workloads can be relatively straightforward – if you get more hits than the existing cloud instances can handle, just add another instance or two.
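On AWS, that rule of thumb can be handed off to a target-tracking scaling policy. Here is a minimal boto3 sketch that asks the service to keep a group’s average CPU near a target; the group name, policy name, and the 50% target are assumptions for illustration, not recommendations:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Target tracking: AWS adds or removes instances on its own to hold the
# group's average CPU near the target value. All names are illustrative.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-tier-asg",  # hypothetical group
    PolicyName="keep-cpu-near-50",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization",
        },
        "TargetValue": 50.0,  # aim for ~50% average CPU across the group
    },
)
```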

By contrast, with enterprise applications and other general workloads, doing automatic scaling well can be much more difficult. An enterprise app may run for many hours and use processing power and memory in many complicated ways. As such, it can be difficult to know which infrastructure resources are being used, much less the utilization levels of those resources. Without that knowledge, it can be impossible to make intelligent – much less automated – scaling decisions.
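Gathering that knowledge usually starts with pulling utilization history for the instances in question. A minimal boto3 sketch follows; note that CPU utilization comes from CloudWatch out of the box, while memory utilization requires installing the CloudWatch agent, and the instance ID below is made up:

```python
import boto3
from datetime import datetime, timedelta

cloudwatch = boto3.client("cloudwatch")

# Pull a week of hourly average CPU utilization for one instance.
# (Memory utilization is NOT a built-in EC2 metric; it requires the
# CloudWatch agent on the instance, so only CPU is shown here.)
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # hypothetical
    StartTime=datetime.utcnow() - timedelta(days=7),
    EndTime=datetime.utcnow(),
    Period=3600,               # one datapoint per hour
    Statistics=["Average"],
)

for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], round(point["Average"], 1))
```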

Indeed, the difficulty in scaling monolithic “spaghetti code” applications is one of the forces driving the conversion of such apps to container-based microservices.

Narrow-function microservices can be turned on and off when needed, and it’s easier to ensure high infrastructure utilization rates with these granular elements. “It’s like putting sand into jars versus stones into jars,” explains Andrew Hillier, CTO of Densify. “More granular workloads are more fluid, and it is easier to stack them together – and to determine when you need more cloud instances.”
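A toy bin-packing exercise makes the sand-versus-stones analogy concrete. In the sketch below, the workload sizes are invented: four large “stone” workloads and twelve small “sand” workloads, each set adding up to the same total demand, are packed first-fit into instances of fixed capacity:

```python
def first_fit(workloads, capacity):
    """Pack workloads (in resource units) into bins of fixed capacity,
    largest first, placing each workload in the first bin where it fits."""
    bins = []
    for w in sorted(workloads, reverse=True):
        for b in bins:
            if sum(b) + w <= capacity:
                b.append(w)
                break
        else:
            bins.append([w])  # no existing bin fits: start a new instance
    return bins

CAPACITY = 10  # resource units per cloud instance (illustrative)

stones = [7, 6, 6, 5]  # four monolithic workloads, 24 units in total
sand = [2] * 12        # twelve microservices, also 24 units in total

print(len(first_fit(stones, CAPACITY)))  # 4 instances, averaging 60% full
print(len(first_fit(sand, CAPACITY)))    # 3 instances, two of them completely full
```

The same total demand fits into fewer, fuller instances, which is exactly the property that makes granular workloads easier to scale automatically.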

Regardless of an application’s design, exploiting the full potential of auto scaling still requires that organizations have good visibility into how their apps consume resources under different loads. Many scaling groups that Densify has studied, for example, show high memory utilization but low CPU utilization. If a company switched those instances to memory-optimized (“high memory”) configurations, it might be able to run fewer of them, saving on the overall bill.
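A back-of-the-envelope calculation shows why. The sketch below uses invented specs and hourly prices (illustrative round numbers, not actual AWS rates) for a scale group that is memory-bound at 160 GB in total:

```python
# Hypothetical memory-bound scale group: 160 GB of memory needed in total,
# with CPU mostly idle. Specs and hourly prices are illustrative only.
MEMORY_NEEDED_GB = 160

shapes = {
    "general-purpose": {"mem_gb": 16, "price": 0.20},
    "memory-optimized": {"mem_gb": 32, "price": 0.27},
}

for name, spec in shapes.items():
    count = -(-MEMORY_NEEDED_GB // spec["mem_gb"])  # ceiling division
    print(f"{name}: {count} instances, ${count * spec['price']:.2f}/hour")

# general-purpose: 10 instances, $2.00/hour
# memory-optimized: 5 instances, $1.35/hour
```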

Ultimately, as with so many of the cloud’s theoretical benefits, success in auto scaling starts with getting good analytics on how your applications run and knowing what infrastructure they require.