How to Eliminate Deadwood in AWS, Azure, & Google Cloud Platform Without Risking Disaster

September 14, 2017

In this final segment of our series on right-sizing public cloud instances for optimal performance and cost efficiency, we focus on a common source of cloud overspending: deadwood.

Idle or zombie instances plague many public cloud environments, quietly wasting opex budget. This deadwood can accumulate without you realizing it, and it is often the result of hasty deployments or a lack of accountability for cloud resources. Someone in the organization spins up an instance for a short-term use and then forgets to shut it down. Workloads change over time, and no one goes back to eliminate the instances that are now idle. Most organizations don’t have an effective process for managing cloud instances and identifying idle ones. Given the complexity of provider invoicing and the lack of visibility into workload patterns, it is often hard to know what is truly idle. Over time, the deadwood piles up.

Eliminating idle instances may seem like a no-brainer, but there is a potential risk. What if the instance is idle 90% of the time but then becomes active for a short period to handle a weekly, monthly, or quarterly workload, such as batch processing? Eliminating that instance could spell disaster, especially if it’s part of a mission-critical application.

How to eliminate public cloud deadwood instances

To avoid that risk, you need to look at the workload pattern across a full business cycle, using sufficient history to erase any doubt. Consider the example below. Looking at a 24-hour period, you could conclude that you’ve identified a deadwood instance. Pull the plug, right?

Figure: Looking at 24 hours of utilization data is not enough

Not so fast. Analyzing several weeks of workload activity paints a very different picture (see figure below). This workload is idle most of the time, but its activity peaks once a week, and this could be an important business process. So you may want to do some right-sizing, or consider scheduling the instance on and off, but it is clearly not deadwood.

Figure: Analyzing activity over weeks is a better approach
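The difference between the two views can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 5% CPU threshold, the 10% "busy fraction" cutoff, the function names, and the synthetic four-week history are all hypothetical and do not come from any provider's tooling or from the analytics behind the figures.

```python
HOURS_PER_WEEK = 7 * 24
IDLE_CPU_PCT = 5.0  # assumed idle threshold, for illustration only


def looks_idle(samples, threshold=IDLE_CPU_PCT):
    """True when every sample in the window stays below the threshold."""
    return all(s < threshold for s in samples)


def classify(hourly_cpu, threshold=IDLE_CPU_PCT):
    """Classify an instance from its full hourly CPU history."""
    if not hourly_cpu:
        raise ValueError("no utilization history to analyze")
    if looks_idle(hourly_cpu, threshold):
        return "deadwood candidate"
    busy_fraction = sum(s >= threshold for s in hourly_cpu) / len(hourly_cpu)
    if busy_fraction < 0.10:  # assumed cutoff for "idle most of the time"
        return "mostly idle, periodic peak: right-size or schedule"
    return "active"


# Synthetic history: four weeks of hourly samples at 2% CPU,
# with a three-hour burst at 70% once per week.
history = [2.0] * (4 * HOURS_PER_WEEK)
for week in range(4):
    start = week * HOURS_PER_WEEK + 100
    for hour in range(start, start + 3):
        history[hour] = 70.0

last_day = history[-24:]
print(looks_idle(last_day))  # the 24-hour view misses the weekly burst
print(classify(history))     # the multi-week view catches it
```

With this synthetic data, the last 24 hours contain no burst, so the short window alone would suggest an idle instance, while the four-week history flags a periodic workload. A real check would need a history long enough to cover monthly or quarterly cycles as well.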

Identifying the true deadwood requires analytics that examine workload patterns across a full business cycle and look at all utilization factors, including CPU, I/O, and memory. With this insight, you can make confident decisions about which instances can be terminated safely to save money, and which are better candidates for downsizing or modernizing for optimal efficiency. In the example below, we’ve identified five instances of deadwood that can be terminated, saving more than $850 a month, or more than $10,000 a year.

Figure: You can save money by terminating instances accurately
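As a rough sketch of that decision, the Python below treats an instance as deadwood only when its peak CPU, I/O, and memory over the whole observation window all stay under a threshold, then totals the savings. The annual figure is simply the monthly figure times 12 (for the example in the text, $850 × 12 = $10,200). The thresholds, field names, instance names, and costs here are hypothetical values chosen for illustration, not the data behind the figure.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
CPU_IDLE_PCT = 5.0
IO_IDLE_OPS = 10.0
MEM_IDLE_PCT = 20.0


@dataclass
class InstanceUsage:
    name: str
    monthly_cost_usd: float
    peak_cpu_pct: float   # peak over the full business cycle
    peak_io_ops: float    # peak I/O operations per second
    peak_mem_pct: float   # peak memory utilization


def is_deadwood(u: InstanceUsage) -> bool:
    """Deadwood only if every utilization factor stays below its threshold."""
    return (
        u.peak_cpu_pct < CPU_IDLE_PCT
        and u.peak_io_ops < IO_IDLE_OPS
        and u.peak_mem_pct < MEM_IDLE_PCT
    )


def savings(fleet):
    """Return deadwood names plus monthly and annual savings if terminated."""
    dead = [u for u in fleet if is_deadwood(u)]
    monthly = sum(u.monthly_cost_usd for u in dead)
    return [u.name for u in dead], monthly, monthly * 12


# Made-up fleet for the sketch.
fleet = [
    InstanceUsage("build-old", 170.0, 1.5, 2.0, 8.0),
    InstanceUsage("demo-tmp", 95.5, 0.8, 0.0, 6.0),
    InstanceUsage("batch-weekly", 210.0, 72.0, 450.0, 61.0),
    # Low CPU but high memory use: not deadwood under this rule.
    InstanceUsage("cache-quiet-cpu", 120.0, 3.0, 4.0, 85.0),
]

names, monthly, yearly = savings(fleet)
print(names, monthly, yearly)
```

Requiring all three factors to be low is the conservative choice: an instance with a quiet CPU but heavy memory or I/O use is left alone and can be reviewed for downsizing instead.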

With the right analytics and true visibility into workload patterns, you can make confident decisions about which workloads should stay and which should go. This insight helps you establish a process for regularly reviewing your cloud instances to ensure you are making the best use of every opex dollar.

More in this Series

This post is part of a series of five articles on how to reduce public cloud costs by choosing the right instances from public cloud vendors such as AWS, Azure, IBM, and Google.