When Selecting Cloud Instances is Mission: Impossible, Machine Learning is Your Secret Weapon

Published on: Sep 11, 2018

Good afternoon, TechOps. Your mission, should you choose to accept it, is to select the optimal cloud instance to meet the needs of every application hosted in your public cloud. Good luck. This message will self-destruct in five seconds.

Perfect cloud instance selection is a core IT Ops mission

No one said IT Operations would be easy. With responsibility for maintaining the uptime and performance of critical business applications, Ops teams are charged with finding the ideal balance between application demand and public cloud supply. Given the complexity of this challenge, this truly is Mission: Impossible. And a “fallout” is not an option.

Battling the Forces of Complexity

On the demand side, understanding and predicting application workload patterns and the resources needed to support them is humanly impossible. CPU, memory and I/O utilization is constantly changing. Trying to analyze reams of raw utilization data for a single application is immensely difficult; multiplying that effort across all hosted applications is just not feasible with manual methods. And as soon as you think you have it nailed down, those application patterns can change again.

Cloud instance selection is massively complex for TechOps

On the supply side, things get even more complex. Public cloud offerings are continually evolving and multiplying. At last count, there were 3.2 million ways to provision Amazon Web Services resources, with new offerings coming online constantly. Adding to the complexity are location restrictions, with some cloud instances only available in certain regions. Plus, there may be corporate contractual constraints that limit what cloud offerings are available to your organization. Add in the availability of Reserved Instances, which offer potential savings in return for a commitment to purchase a specified cloud capacity for a specified period of time, and the choices are truly mind-boggling.

Let’s not forget the cost side: your boss has his guns out for you, demanding to know why this stuff costs so much and insisting that you do your job with the least amount of collateral damage. Who said cloud was supposed to be cheap? It is not when you simply overprovision to make sure applications don’t fail. The problem gets worse when you make RI commitments based on the configurations you currently have, without optimizing first. Your cloud cost efficiency will spiral downward and eventually cut into the profit margin of your business.
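
To put rough numbers on the "optimize first" point, here is a small back-of-the-envelope sketch. Every figure in it (the hourly rates and the 30% Reserved Instance discount) is a made-up assumption for illustration, not actual cloud pricing, and the function name is hypothetical.

```python
HOURS_PER_YEAR = 8760

def annual_cost(hourly_usd, discount=0.0):
    """Yearly cost of running one instance around the clock at a given discount."""
    return hourly_usd * (1 - discount) * HOURS_PER_YEAR

# Illustrative assumptions only, not real prices.
oversized_hourly = 0.40    # the overprovisioned instance you run today
right_sized_hourly = 0.20  # the instance the workload actually needs
ri_discount = 0.30         # assumed Reserved Instance discount

reserve_as_is = annual_cost(oversized_hourly, ri_discount)
optimize_then_reserve = annual_cost(right_sized_hourly, ri_discount)

print(f"Reserve the current configuration: ${reserve_as_is:,.2f} per year")
print(f"Optimize first, then reserve:      ${optimize_then_reserve:,.2f} per year")
```

Under these assumed numbers, the discount on the oversized instance still leaves you paying twice what the right-sized commitment would cost, and the commitment locks that waste in for the whole term.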

So, what is the typical “fallout” of this Mission: Impossible? It usually means subscribing to suboptimal cloud services, leading to poor application performance, unnecessarily high cloud costs, or both. Unintentionally underprovisioning cloud resources can cause application performance problems or even outages that could force TechOps professionals to brush up their resumes. Overprovisioning to “cover your assets” is no guarantee that a spike in application demand won’t degrade application performance. And over time, higher than necessary cloud costs are sure to bring budget heat down on TechOps.

Machine Learning to the Rescue

To tackle this mission impossible, you don’t need Ethan Hunt and his IMF team; you need machine learning. It’s the only way to effectively match application demand to the optimal cloud instance among the millions of available offerings.

Machine learning can evaluate complex, dynamic application workload patterns over time to determine the optimal blend of CPU, memory and I/O resources required to ensure application performance, without over-provisioning. This analysis also takes into account business and functional requirements, policies (such as security and privacy policies), and other application-specific factors that impact hosting decisions—including whether or not the application should even be in the cloud.
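
As a rough illustration of the demand-side idea, the sketch below sizes one resource from its utilization history. It uses a plain nearest-rank percentile plus a headroom margin, which is a simple statistic and not machine learning, and it is not a description of how any particular product works. The function names, the sample data, and the 90th percentile and 25% headroom settings are all assumptions chosen for the example.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    # Nearest-rank method: smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def required_capacity(samples, pct=95, headroom=0.2):
    """Capacity to provision: the chosen percentile of demand plus a safety margin."""
    return percentile(samples, pct) * (1 + headroom)

# Hypothetical hourly CPU demand, in vCPUs, for a single application.
cpu_demand = [1.0, 1.2, 0.9, 3.8, 1.1, 1.3, 1.0, 2.0, 1.1, 1.2]

print(required_capacity(cpu_demand, pct=90, headroom=0.25))
```

Sizing to the peak (3.8 vCPUs here) would overprovision for a single spike, while sizing to the average would underprovision. Choosing the percentile and headroom is a policy decision, and a real analysis would also have to handle memory, I/O and patterns that shift over time.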

Machine learning is needed to sift through the mountains of constantly changing public cloud offerings to identify the right family, the right size, the right service, the right scaling option, and even the right generation of instance to meet application requirements. This capability becomes even more crucial when selecting Reserved Instances or when optimizing containers shared by multiple applications.
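
The supply-side matching can be pictured with a toy example: filter a catalog down to the instances that satisfy the application's requirements and region constraint, then take the cheapest one. This is a minimal sketch under stated assumptions. The catalog entries, instance names, prices and region labels are invented for illustration, and a brute-force filter like this stands in for, rather than reproduces, the analysis described above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Instance:
    name: str
    vcpus: int
    memory_gib: float
    hourly_usd: float
    regions: frozenset

# Hypothetical catalog: names, sizes, prices and regions are made up.
CATALOG = [
    Instance("general-small", 2, 8, 0.10, frozenset({"east", "west"})),
    Instance("general-large", 4, 16, 0.20, frozenset({"east", "west"})),
    Instance("compute-large", 4, 8, 0.17, frozenset({"east"})),
    Instance("memory-large", 4, 32, 0.26, frozenset({"east", "west"})),
]

def cheapest_fit(catalog, need_vcpus, need_memory_gib, region):
    """Return the lowest-priced instance that meets the requirements, or None."""
    candidates = [
        inst for inst in catalog
        if region in inst.regions
        and inst.vcpus >= need_vcpus
        and inst.memory_gib >= need_memory_gib
    ]
    return min(candidates, key=lambda inst: inst.hourly_usd, default=None)

print(cheapest_fit(CATALOG, need_vcpus=2.5, need_memory_gib=8, region="east"))
print(cheapest_fit(CATALOG, need_vcpus=2.5, need_memory_gib=8, region="west"))
```

With four entries the answer is easy to check by eye. The same requirement yields a different instance in each region because one offering is not available everywhere, which hints at why the search becomes unmanageable by hand across millions of combinations.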

Mission: Accomplished

Put simply, machine learning takes the guesswork and risk out of selecting cloud instances to ensure application performance and availability, while maximizing resource efficiency.

Ethan Hunt and his IMF team assume death-defying risks with every mission they take on. But TechOps shouldn’t have to. By tapping the power of machine learning to rationally and automatically match applications to the ideal cloud infrastructure, they can avoid application (and career) risk and turn Mission: Impossible into Mission: Accomplished.

Get Your Mission Briefing

Sign up for a personalized demo of the Densify cloud optimization service today and see how we can make public cloud instance selection Mission: Possible for your organization.

Stay tuned: In our next article, we’ll look at how machine learning helps tackle the special challenges posed when using Infrastructure as Code solutions to provision cloud-based applications.

Andrew Hillier CTO, Densify
