ITSM Change Management to Control Continuous Cost Optimization

July 17, 2020

The Challenges to Automating Cloud Cost Optimization

I’ve been writing about continuous cloud optimization for a while now, and I recently spoke with several organizations to understand the challenges they face in their automation journey. Their insights help us understand how we can improve our technology to better support them.

I discovered two fundamental themes behind their challenges:

  1. The inability to control which services within the environment are continuously optimized – Not all products are ready for continuous optimization, so some must continue to run without it.
  2. The inability to control continuous optimization using internal ITSM processes – Without this control, communication between application stakeholders and the ability to audit, track, and trace the lifecycle of an optimization task become increasingly unmanageable.

Continuous cloud optimization relies on the systematic reallocation of cloud supply to improve resource utilization, which ultimately leads to a significant reduction in cloud costs. We’re talking about implementing precise infrastructure changes to complex IT environments comprising hundreds of thousands of running services managed by large enterprises. Given the number of actors and moving parts, ad hoc processes simply don’t suffice to address the two themes above. Totally understandable.

Tackling the Problem with ITSM Change Management

We can enable an ITSM change management process to control the implementation of an optimization task for continuously-optimizing products created and deployed through a service catalog. I say continuously-optimizing to differentiate between products that have been enabled for continuous optimization and products that have not.

It is worth noting that although the design paradigms driving this solution are generic, our implementation of it uses the technologies listed below. The solution integrates these components to provide ITSM-controlled continuous optimization for IaaS infrastructure in an AWS environment.

ITSM-Governed Continuous Cloud Optimization Solution Components
AWS Service Catalog – The technical service catalog where products are created by engineering staff
ServiceNow – Serves two functions: 1. to host the catalog of continuously-optimizing products that can be ordered by application owners, and 2. to facilitate the change management process to control the optimization lifecycle
Densify – The optimization service that will continuously analyze resource utilization to generate insights for more optimal supply allocations
AWS Systems Manager Parameter Store – The parameter repository that will be used to hold new insights and track their lifecycle
AWS Systems Manager OpsCenter – The work order management system where cloud operations can manage the execution of approved optimization tasks
AWS CloudFormation – The infrastructure as code technology that will manage the low-level deployment and update of the product’s underlying services

The Continuous Cloud Optimization Product Lifecycle

There are a number of processes that make up the product lifecycle in the context of this solution. We’re going to illustrate that these sets of processes can address the two themes we described above.

[Figure: ITSM-controlled continuous cloud optimization product lifecycle]

1. Create a New Product

Engineering can create products within AWS Service Catalog that inherit a specialized tag. This tag indicates that the product is continuously optimizing, satisfying the first theme. The product can now be published to ServiceNow, where it is visible to our application owners.
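To make the tagging idea concrete, here is a minimal sketch of how a process might decide whether a running service has opted in. The tag key and value are hypothetical (this article does not name the actual tag), and the resource dictionaries simply mimic the Key/Value tag shape that AWS APIs commonly return.

```python
# Hypothetical tag key and value; the real tag name is not given in this article.
OPTIMIZE_TAG_KEY = "continuous-optimization"
OPTIMIZE_TAG_VALUE = "enabled"


def is_continuously_optimizing(tags):
    """Return True if a list of AWS-style {'Key', 'Value'} tags opts in."""
    return any(
        tag.get("Key") == OPTIMIZE_TAG_KEY
        and tag.get("Value", "").lower() == OPTIMIZE_TAG_VALUE
        for tag in tags
    )


def select_optimizable(resources):
    """Keep only the resources whose tags opt them in to optimization."""
    return [r for r in resources if is_continuously_optimizing(r.get("Tags", []))]
```

Anything without the tag is left alone, which is how products that are not ready for continuous optimization keep running untouched.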

2. Order Published Product

Application owners can now log into ServiceNow to order these newly published products. Product specific inputs are provided along with the order request. The request is issued to AWS Service Catalog after the order approvals are satisfied.

3. Product Fulfilment

AWS CloudFormation is invoked with the product spec to facilitate the deployment of the underlying services. Notifications are delivered upstream to indicate that the product is ready for use. Any issues during the fulfilment process result in a notification to cloud operations to initiate the support process.

4. Continuous Optimization

A set of processes will continuously ensure the appropriate supply allocation is assigned to running services with the specialized tag. ServiceNow change management processes are invoked to control the approval or denial of an optimization task, satisfying the requirement for the second theme.

5. Delete

Finally, when the product is no longer required, it can be decommissioned from the ITSM service catalog.

Continuous Cloud Cost Optimization Service

[Figure: Continuous cloud optimization lifecycle]

This figure describes the optimization lifecycle for a service that has been enabled for continuous optimization. The Acquire Approval and Acquire Maintenance Window processes are part of the ITSM collaboration phase, which gives both application owners and cloud operations direct control over whether an optimization task is executed.
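One way to reason about this lifecycle is as a small state machine. The sketch below is illustrative only: the five stage names come from the steps that follow, while the terminal states and the event names ("done", "approved", "denied", "failed") are my own assumptions.

```python
from enum import Enum


class Stage(Enum):
    RESOURCE_ANALYSIS = "Resource Analysis"
    PARAMETER_REPO_SYNC = "Parameter Repo Sync"
    ACQUIRE_APPROVAL = "Acquire Approval"
    ACQUIRE_MAINTENANCE_WINDOW = "Acquire Maintenance Window"
    UPDATE_PRODUCT = "Update Product"
    # The three terminal states below are assumed names, not from the article.
    COMPLETED = "Completed"
    CANCELLED = "Cancelled"
    NEEDS_INVESTIGATION = "Needs Investigation"


# (current stage, event) -> next stage
TRANSITIONS = {
    (Stage.RESOURCE_ANALYSIS, "done"): Stage.PARAMETER_REPO_SYNC,
    (Stage.PARAMETER_REPO_SYNC, "done"): Stage.ACQUIRE_APPROVAL,
    (Stage.ACQUIRE_APPROVAL, "approved"): Stage.ACQUIRE_MAINTENANCE_WINDOW,
    (Stage.ACQUIRE_APPROVAL, "denied"): Stage.CANCELLED,
    (Stage.ACQUIRE_MAINTENANCE_WINDOW, "done"): Stage.UPDATE_PRODUCT,
    (Stage.UPDATE_PRODUCT, "done"): Stage.COMPLETED,
    (Stage.UPDATE_PRODUCT, "failed"): Stage.NEEDS_INVESTIGATION,
}


def advance(stage, event):
    """Return the next stage, or raise ValueError for a disallowed move."""
    try:
        return TRANSITIONS[(stage, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not valid in stage {stage.value!r}") from None


def run(events, start=Stage.RESOURCE_ANALYSIS):
    """Apply a sequence of events and return the stage reached."""
    stage = start
    for event in events:
        stage = advance(stage, event)
    return stage
```

Because there is no transition from Acquire Approval to Update Product, an update cannot be reached without passing through both ITSM collaboration steps.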

1. Resource Analysis

After the product has been running for a sufficient period of time, Densify can analyze the resource utilization to recommend more optimal supply allocations.

2. Parameter Repo Sync

New insights are delivered to a repository typically situated within the DevOps pipeline. We use AWS Systems Manager Parameter Store as our repository for this implementation.
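As a rough sketch of what such a sync could look like, the function below writes one insight as a JSON string parameter. It assumes an object that behaves like `boto3.client("ssm")` is passed in. The parameter path layout and the insight fields are hypothetical; they are not Densify's actual schema.

```python
import json


def insight_parameter_name(stack_name, resource_id):
    """Build a hierarchical parameter name (this path layout is an assumption)."""
    return f"/optimization/{stack_name}/{resource_id}/insight"


def sync_insight(ssm, stack_name, resource_id, insight):
    """Write one insight to Parameter Store and return the parameter name.

    `ssm` is expected to behave like boto3.client("ssm").
    """
    name = insight_parameter_name(stack_name, resource_id)
    ssm.put_parameter(
        Name=name,
        Value=json.dumps(insight, sort_keys=True),
        Type="String",
        Overwrite=True,  # a newer insight replaces the previous one
    )
    return name
```

A call might look like `sync_insight(boto3.client("ssm"), "web-stack", "AppServer", {"current_type": "m5.xlarge", "recommended_type": "m5.large", "status": "pending_approval"})`, where the `status` field is one possible way to track the insight's lifecycle.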

3. Acquire Approval

The delivery of insights triggers the creation of a change management ticket in ServiceNow. Application owners can now review the approval request and choose to approve or deny it. An approval continues the process and a denial cancels the optimization task.
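For illustration, the ticket could be opened by posting to the `change_request` table through the ServiceNow Table API. The sketch below only builds the HTTP request and does not send it. The instance name, the wording of the ticket, and the choice of fields are assumptions, and authentication is deliberately omitted.

```python
import json
import urllib.request


def build_change_request(instance, stack_name, current_type, recommended_type):
    """Build (but do not send) a POST that would create a change request."""
    url = f"https://{instance}.service-now.com/api/now/table/change_request"
    payload = {
        "short_description": (
            f"Optimize {stack_name}: {current_type} -> {recommended_type}"
        ),
        "description": (
            "Continuous optimization insight awaiting application owner approval."
        ),
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )
```

Sending the request would be a matter of adding credentials and passing it to `urllib.request.urlopen`. The approve or deny decision itself stays inside ServiceNow's change workflow.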

4. Acquire Maintenance Window

Upon approval of an insight, a work order is issued to cloud operations via AWS Systems Manager OpsCenter. From the work order, cloud operations can create a maintenance window to execute the requested work.
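A work order like this maps naturally onto an OpsCenter OpsItem. Here is a minimal sketch, assuming an object that behaves like `boto3.client("ssm")`. The title, source label, and operational data key are hypothetical values chosen for the example.

```python
def open_work_order(ssm, stack_name, change_number):
    """Create an OpsItem for an approved optimization and return its ID.

    `ssm` is expected to behave like boto3.client("ssm").
    """
    response = ssm.create_ops_item(
        Title=f"Optimization approved for {stack_name}",
        Description=(
            f"Change request {change_number} was approved. "
            "Schedule a maintenance window to apply the update."
        ),
        Source="ContinuousOptimization",  # free-form label, assumed for this sketch
        OperationalData={
            # Hypothetical key linking the OpsItem back to the ITSM ticket.
            "changeRequest": {"Value": change_number, "Type": "String"},
        },
    )
    return response["OpsItemId"]
```

Keeping the change request number on the OpsItem is one way to preserve the audit trail between the approval in ServiceNow and the execution by cloud operations.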

5. Update Product

When the maintenance window date/time arrives, a CloudFormation stack update will be issued if the existing stack is in a stable state. The stability of the stack is validated through drift detection. The resource update events are delivered to the OpsCenter ticket for tracking. Upon successful update, notifications are sent to both OpsCenter and ServiceNow. Failure to update results in a new OpsCenter ticket for manual investigation.
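The drift-gated update can be sketched roughly as follows, assuming an object that behaves like `boto3.client("cloudformation")`. The polling defaults and the single-parameter update are simplifications of my own. A real stack would likely also need `UsePreviousValue` entries for its other parameters and possibly `Capabilities`.

```python
import time


def stack_is_in_sync(cfn, stack_name, poll_seconds=5, max_polls=60):
    """Run drift detection and return True only if the stack is IN_SYNC."""
    detection_id = cfn.detect_stack_drift(StackName=stack_name)[
        "StackDriftDetectionId"
    ]
    for _ in range(max_polls):
        status = cfn.describe_stack_drift_detection_status(
            StackDriftDetectionId=detection_id
        )
        if status["DetectionStatus"] == "DETECTION_COMPLETE":
            return status["StackDriftStatus"] == "IN_SYNC"
        if status["DetectionStatus"] == "DETECTION_FAILED":
            return False
        time.sleep(poll_seconds)
    return False  # treat a timeout as "not safe to update"


def apply_insight(cfn, stack_name, parameter_key, new_value, **poll_options):
    """Issue a stack update only when no drift is detected.

    Returns True if the update was submitted, False if it was skipped.
    """
    if not stack_is_in_sync(cfn, stack_name, **poll_options):
        return False
    cfn.update_stack(
        StackName=stack_name,
        UsePreviousTemplate=True,
        # Only the optimized parameter is shown; other parameters are omitted.
        Parameters=[{"ParameterKey": parameter_key, "ParameterValue": new_value}],
    )
    return True
```

A `False` return is the point at which a new OpsCenter ticket would be opened for manual investigation.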

Summary

This article covered a high-level process overview, describing how resource optimization can be implemented using an ITSM-controlled change management process. It was meant to give you a preview of how we have solved the challenges behind practical automation of infrastructure optimization. The solution that implements what was described here relies on a point source highly available architecture implemented within AWS. I will delve into that architecture in an upcoming article, ITSM-Controlled Continuous Cost Optimization using AWS, ServiceNow, & Densify.

If you are interested in seeing a demonstration of this capability, please consult the resources below, or request a 1:1 demonstration with one of our Cloud Advisors.


Resources