Building Organizational Trust for Cloud Optimization Software

September 12, 2023

Getting Developers to Take Action

TL;DR: Tune the recommendation engine and people will be willing to optimize.

You’ve loaded data into Densify, but after reviewing the recommendations from the cloud optimization software, your developers don’t want to take the recommended actions. They don’t trust them yet.

This is one of the top issues identified by the FinOps Foundation, and many Densify customers face it too, but not all. Let’s explore what some customers do differently to build organizational trust in acting on the recommendations of cloud optimization software.

One of the keys to success is appreciating that not every application in your company should get the same treatment. Some apps have special rules, while others do not. Some apps can run at higher utilization than others, where you might want to be more conservative.

Optimization recommendations, especially where automation is being leveraged, must take those nuances into consideration. Otherwise, trust can’t be gained and your optimization initiatives will flounder.

Optimization Recommendations That Consider Business Nuances

To differentiate and provide trustworthy recommendations for developers to leverage, companies can elect to tune the Densify engine.

The following are but a few tuning approaches our customers leverage to drive organizational trust leading to developers implementing recommendations – sometimes manually, and sometimes through automation frameworks that enable IaC such as Terraform and CloudFormation.

How much data is considered?

Graph showing workload usage patterns

Figure 1, Understanding every workload’s long-term usage pattern

A popular question is how much flexibility there is around the amount of historical data considered when generating a recommendation. Workload history is a tunable setting that controls the number of days of data needed for analysis, and it can differ for increases vs. decreases in resource allocations. It is normal for customers to want to consider 30, 60 or even 90 days of historical usage.
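As a rough illustration only (this is not Densify's configuration format), separate history windows for increases and decreases could be sketched like this in Python. The setting names, and the choice of 30 days for upsizing and 90 days for downsizing, are assumptions made for the example.

```python
from datetime import date, timedelta

# Hypothetical settings: days of history required before recommending a change.
# Which direction needs the longer window is an assumption, not a product default.
HISTORY_DAYS = {"upsize": 30, "downsize": 90}


def has_enough_history(first_sample: date, today: date, change: str) -> bool:
    """Return True when enough days of data exist for this kind of change."""
    days_observed = (today - first_sample).days
    return days_observed >= HISTORY_DAYS[change]


today = date(2023, 9, 12)
first_sample = today - timedelta(days=45)
print(has_enough_history(first_sample, today, "upsize"))    # True  (45 >= 30)
print(has_enough_history(first_sample, today, "downsize"))  # False (45 < 90)
```

With 45 days of data, this sketch would allow a recommendation to add resources but would hold back a recommendation to remove them until more history is available.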

Prioritize Risk or Cost Savings?

When you start feeding data into Densify, by default the engine will produce conservative recommendations. The default settings are the result of years of research with our customers.

There are usually two broad categories of adjustments to this conservativeness: either the customer will prioritize risk mitigation, or they’ll prioritize cost savings. You hear more about cost savings these days, but it is quite typical for any business to have some mix across different applications. Some apps are mission critical and must run well at any cost, while other apps can be constrained and run at lower cost. Settings for the number of cores, the amount of memory and I/O capabilities can be adjusted to ensure the needs of both types of optimization are considered, rather than taking a broad-brush approach, which tends not to build organizational trust.

“What about memory usage?”

Users just getting started with optimization are often concerned that most Cloud Service Providers (CSPs) don’t provide memory utilization data by default. There are two approaches to tuning for this issue.

The first approach to dealing with a lack of memory data is called “Backfill”, where we lock the amount of configured memory for an instance and thus size only on CPU. The result is an optimization recommendation that’s a step in the right direction but not ideal. The second approach is to supply real memory data, which is what an ideal recommendation needs. To do that, you can either set up memory data collection with your CSP or import the data via a third-party tool such as Datadog.
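To make the “size only on CPU” idea concrete, here is a minimal Python sketch. It is not how the Densify engine is implemented. The catalog, instance names, prices and the 25% CPU headroom are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    memory_gib: int
    hourly_cost: float


# Hypothetical catalog: names and prices are invented for this example.
CATALOG = [
    InstanceType("small-a", 2, 16, 0.10),
    InstanceType("medium-a", 4, 16, 0.17),
    InstanceType("medium-b", 4, 32, 0.25),
    InstanceType("large-a", 8, 32, 0.40),
]


def size_on_cpu_only(current: InstanceType, peak_vcpus_used: float,
                     headroom: float = 1.25) -> InstanceType:
    """Keep memory locked at its configured value and pick the cheapest
    instance type with enough vCPUs. Falls back to the current type."""
    needed = peak_vcpus_used * headroom
    candidates = [
        i for i in CATALOG
        if i.memory_gib == current.memory_gib and i.vcpus >= needed
    ]
    return min(candidates, key=lambda i: i.hourly_cost, default=current)


current = CATALOG[3]  # large-a: 8 vCPUs, 32 GiB
print(size_on_cpu_only(current, peak_vcpus_used=2.0).name)  # medium-b
```

Because memory is locked, the sketch can never suggest one of the 16 GiB types, even if the workload only uses a fraction of its memory. That is the gap real memory data closes.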

Graph showing lower usage with real memory data

Figure 2, With real memory data we can see the usage is much lower

“Restricting the CSP Catalog”

The catalog of instance types that developers can pick from in any CSP is big and complex. Sometimes business or architectural decisions result in the removal of certain options. For example, a customer might elect not to use burstable instances. The engine can be tuned to ignore instance types as needed.
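Conceptually, restricting the catalog is a filter applied before any recommendation is made. The sketch below assumes AWS-style type names, where the burstable families are the T-series. The function and variable names are hypothetical and do not reflect the product’s settings.

```python
# Families to ignore. AWS burstable instances are the T-series;
# the exact list a customer excludes is an assumption here.
EXCLUDED_FAMILIES = {"t2", "t3", "t3a", "t4g"}


def family(instance_type: str) -> str:
    """Return the family prefix of an AWS-style name, e.g. 'm5' for 'm5.large'."""
    return instance_type.split(".", 1)[0]


def allowed(catalog: list[str]) -> list[str]:
    """Drop every instance type whose family is excluded."""
    return [t for t in catalog if family(t) not in EXCLUDED_FAMILIES]


catalog = ["t3.medium", "m5.large", "c5.xlarge", "t4g.small"]
print(allowed(catalog))  # ['m5.large', 'c5.xlarge']
```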

“Play by your own rules”

Developers often struggle with deciding what instance type to choose for a given deployment. Some patterns we see are simply choosing what you chose the last time you deployed, or going with something newly available from your CSP that you heard about in the media. In the ideal case, developers go to the extent of conducting non-production performance/scalability testing of their app on different instance types (hopefully under load that simulates the workload volumes they would see after deployment). But testing of that nature is hard, costly and time-consuming, so it’s rare. So, developers make a decision and deploy. Did they get it right?

The engine has built-in rules for picking from the plethora of CSP instance options automatically and at scale, so developers can focus on functional requirements that provide business value. That said, if you have unique rules that need to be considered, you can define new rules to govern instance selection.

Two simple examples of custom rules might be:

  1. Software ISV Certification such as SAP Certified Instance Types
    Table showing SAP rule sets
    Figure 3, SAP certified instance rules
  2. Restricting the ratio of CPU cores to memory of your instances, e.g., a maximum of 4 cores and 18GB of memory, or perhaps a minimum of 4 cores and 8GB of memory
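One way to picture custom rules like these is as small predicates that every candidate instance type must pass. The Python sketch below is an illustration under that assumption, not the engine’s rule syntax. The candidate names and the “certified” list are placeholders, not real SAP-certified types.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Candidate:
    name: str
    cores: int
    memory_gb: int


Rule = Callable[[Candidate], bool]


def max_size(cores: int, memory_gb: int) -> Rule:
    """Rule: reject anything above the given core and memory ceilings."""
    return lambda c: c.cores <= cores and c.memory_gb <= memory_gb


def certified(names: Iterable[str]) -> Rule:
    """Rule: allow only types on a certification list (placeholder names)."""
    allowed = set(names)
    return lambda c: c.name in allowed


def apply_rules(candidates: List[Candidate], rules: List[Rule]) -> List[Candidate]:
    """Keep the candidates that satisfy every rule."""
    return [c for c in candidates if all(rule(c) for rule in rules)]


candidates = [
    Candidate("type-a", 2, 8),
    Candidate("type-b", 4, 16),
    Candidate("type-c", 8, 32),
]
rules = [max_size(cores=4, memory_gb=18), certified({"type-b", "type-c"})]
print([c.name for c in apply_rules(candidates, rules)])  # ['type-b']
```

Expressing each rule as its own function keeps the rules independent, so a certification list can change without touching the sizing limits.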

“Cost and Usage Attribution”

Code with tagging errors

Figure 4, Who’s the owner?

A common challenge in getting recommendations actioned is getting them into the hands of the person responsible for setting the instance type for a given app. You need to know who deployed it! If that metadata is not available, then the optimization opportunity falls into the “unknown” category where no one knows who can act on it. Sadly, this is a common category for most businesses. Tagging should be automated, not left to humans. Ideally, tags specify the owner of every instance you’re running, but that is not the norm in the industry. If you have metadata in another source beyond the CSP, such as a CMDB, that too can be imported and used to attribute instance ownership as well as costs, and thus support getting optimization recommendations into the hands of someone who can do something with them.
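The attribution logic described above boils down to a lookup with fallbacks: use the owner tag if present, otherwise consult the imported CMDB data, otherwise file the instance under “unknown”. Here is a minimal Python sketch of that idea. The `owner` tag key and the data shapes are assumptions for the example.

```python
from collections import defaultdict


def resolve_owner(instance: dict, cmdb: dict) -> str:
    """Prefer the CSP tag, then the CMDB record, then fall back to 'unknown'."""
    tags = instance.get("tags", {})
    owner = tags.get("owner") or cmdb.get(instance["id"])
    return owner or "unknown"


def group_by_owner(instances: list, cmdb: dict) -> dict:
    """Bucket instance IDs by the owner who should receive recommendations."""
    grouped = defaultdict(list)
    for inst in instances:
        grouped[resolve_owner(inst, cmdb)].append(inst["id"])
    return dict(grouped)


instances = [
    {"id": "i-1", "tags": {"owner": "team-a"}},
    {"id": "i-2", "tags": {}},   # untagged, but known to the CMDB
    {"id": "i-3"},               # no tags and no CMDB record
]
cmdb = {"i-2": "team-b"}
print(group_by_owner(instances, cmdb))
# {'team-a': ['i-1'], 'team-b': ['i-2'], 'unknown': ['i-3']}
```

The size of the “unknown” bucket is a useful number to track, since every instance in it is a recommendation nobody will act on.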

“The costs should consider our CSP discount”

It’s typical for larger organizations to negotiate discounts with CSPs well beyond RIs/Savings Plans, Committed Use Discounts and other usage-based savings schemes. The engine can be tuned to factor in a discount across your usage so that the financial data is more representative of what you might see in your CSP bill.
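The arithmetic is simple, but it changes how believable the savings numbers look. As a sketch, assuming a single flat percentage discount (real agreements are often more nuanced), a list-price figure could be adjusted like this. The function name and the 12% figure are illustrative only.

```python
def discounted(amount: float, discount_pct: float) -> float:
    """Apply a flat negotiated discount (in percent) to a list-price amount."""
    if not 0 <= discount_pct <= 100:
        raise ValueError("discount_pct must be between 0 and 100")
    return round(amount * (1 - discount_pct / 100), 2)


list_price_savings = 1000.00  # monthly savings estimated at list price
print(discounted(list_price_savings, 12))  # 880.0
```

A recommendation that promises $1,000 a month at list price is really worth $880 to an organization with a 12% discount, and showing the latter figure matches what finance sees on the bill.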

“But we’re unique, can you do XYZ?”

Yes we can! Our Professional Services team can customize the engine at any step along the way, from data collection through to distribution of optimization recommendations. One simple example of this that helps build trust is when a group of systems all need to have the same instance type. This is usually the case for Disaster Recovery system groups. Customers will tag the systems in a group or define a naming convention, then when the engine decides that a better option is available, every instance in the group will receive a recommendation that is consistent.
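To illustrate what “consistent” could mean in practice, the Python sketch below gives every member of a group the largest type recommended for any one of them, so no member ends up undersized. That tie-breaking policy is our assumption for the example, not a description of how the engine resolves groups. The group and system names are hypothetical.

```python
# Ordered smallest to largest; used to compare recommendations.
SIZE_ORDER = ["m5.large", "m5.xlarge", "m5.2xlarge"]


def harmonize(recs: dict, groups: dict) -> dict:
    """Give all members of each group the same recommended instance type.

    Assumed policy: pick the largest type recommended for any member.
    Systems that are not in a group keep their own recommendation.
    """
    result = dict(recs)
    for members in groups.values():
        biggest = max((recs[m] for m in members), key=SIZE_ORDER.index)
        for m in members:
            result[m] = biggest
    return result


recs = {"dr-primary": "m5.xlarge", "dr-standby": "m5.large", "web-1": "m5.large"}
groups = {"dr": ["dr-primary", "dr-standby"]}
print(harmonize(recs, groups))
# {'dr-primary': 'm5.xlarge', 'dr-standby': 'm5.xlarge', 'web-1': 'm5.large'}
```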

Tune for Trust

An optimization engine must provide actionable recommendations for optimization that are tunable to consider the unique nuances of each application. At the same time developers don’t have time to spend scrutinizing the seemingly infinite CSP deployment options – they’re busy building business value.

The cloud optimization software you choose for your business must be tunable as needed to win the trust of developers, enabling them to automate the instance selection process.

We’d love to chat with you to understand your situation, and give you some tips while demonstrating what we can do to help your organization free up resources. Connect with us for a short exploratory conversation and demo.

For more information on the importance of optimizing your tags, watch a 20-min session (demo included) on Cloud & Kubernetes Tagging Best Practices.