Elastic Kubernetes Service (EKS) is a managed Kubernetes platform that Amazon
Web Services (AWS) operates. Kubernetes is an open-source container orchestration platform
that enables users to automate the management of containerized applications, including deployment,
rollback, storage, scaling, self-healing, and network routing.
EKS users benefit from AWS automatically managing their Kubernetes control planes, allowing them to focus
exclusively on their data planes (worker nodes, applications, and surrounding infrastructure).
Optimizing the costs for EKS requires understanding how various components of an EKS cluster are charged
and how the cluster configuration can be tweaked to mitigate unnecessary expenditures.
The minimum cost of any EKS cluster is $73 per month
($0.10 per hour). This cost is for the managed control plane created for every EKS cluster. On top of
this cost, users must provision other resources (such as EC2
instances) to run their clusters.
This article will discuss how users can effectively approach EKS cost optimization for all resources in
the cluster.
Cost optimization technique | Summary |
---|---|
Optimizing worker node costs | Worker nodes are often the most expensive components in an EKS cluster, so correctly configuring them will have a significant impact on total cost of ownership. |
Rightsizing EC2 worker nodes | Worker nodes should be rightsized using metrics data to determine optimal EC2 instance types. |
Implementing spot instances | Spot instances can save up to 90% compared to on-demand EC2 instances; implementing this feature as part of a cost optimization strategy is essential. |
AWS reserved instances and savings plans | Reserved instances and savings plans allow users to enter long-term contracts to commit to EC2 instance usage and spend in exchange for a discount. This can be useful for users with predictable instance usage requirements. |
Fargate | Fargate is a serverless compute option available for EKS users. While this compute option is more expensive than EC2 instances, users may save on labor costs due to the difference in operational overhead. |
Optimizing pod costs | Pods consume hardware capacity from worker nodes, so optimizing pod configurations is important for maintaining cost efficiency. |
Vertically rightsizing pods | Pods should be rightsized to allocate the appropriate amount of resources, like CPU and memory. Rightsizing will avoid waste while ensuring optimal pod performance. |
Horizontally rightsizing pod replicas | Pod replicas should be scaled appropriately according to application demands. Autoscaling solutions should prevent having unnecessary replicas consume valuable compute capacity while ensuring that enough replicas are available for the application to operate correctly. |
Optimizing data transfer costs | Data transfer costs for AWS are often overlooked but can contribute significantly to the total cost of ownership for EKS clusters. |
Minimize cross-zone traffic | Traffic crossing availability zones incurs additional charges, so isolating traffic within a zone can be an effective cost optimization technique. |
Use VPC private endpoints | VPC endpoints allow users to connect to AWS services without traversing the Internet. Since Internet traffic incurs heavy charges from AWS, avoiding this will lead to significant cost savings. |
Use VPC peering or transit gateways | AWS charges more for traffic traversing the Internet than for traffic in the internal network. Therefore, leveraging VPC peering and transit gateways is useful for reducing costs by ensuring network traffic remains within the AWS internal network. |
Implement a caching strategy | Caching responses from resources like databases can reduce unnecessary network traffic and data transfer charges while improving application performance. |
Monitor data transfer usage | Analyzing data transfer costs via AWS Cost Explorer or other observability tools will help identify patterns and anomalies as part of a cost optimization strategy. |
Leverage observability for cost optimization | Observability tools can gather data within an EKS cluster and surrounding AWS infrastructure to provide insight into performance bottlenecks, overallocated resources, long-term trends, and anomalous expenses. |
Worker nodes are the host machines
where the user’s Kubernetes pods are run. Each node in the cluster is responsible for connecting to the
control plane to fetch information related to Kubernetes resources such as pods, secrets, and volumes to
run on the host machine. Every EKS cluster must have at least one worker node to deploy the user’s pods.
The counterparts of worker nodes are called master nodes, which run all Kubernetes control plane
components, such as the Kube Scheduler, API Server, and Kube Controller Manager. Every EKS cluster will
have a fully AWS-managed control plane, so no master node configuration is required.
Worker nodes can be deployed as EC2 instances or serverless EKS Fargate nodes. This section
will discuss both compute options. Worker nodes typically contribute most of an EKS cluster’s cost and
must be carefully planned to mitigate unwanted expenses.
There are hundreds of EC2 instance types
available on AWS. Each instance type will have a unique hardware combination (CPU, memory, network, and
storage) to support specific use cases. Selecting instance types with insufficient resources will lead
to resource exhaustion issues and application downtime in an EKS cluster, while choosing instance types
that are too large will result in wasted resources and excess costs. A balanced approach to compute
capacity will maximize application performance while controlling costs.
Rightsizing worker nodes should be done based on resource utilization data. Rightsizing involves
analyzing EC2 instance utilization data to determine an appropriate instance type choice based on many
resource-related metrics. While this selection can be done manually, doing so adds the operational
overhead of evaluating metrics data and may result in inaccurate rightsizing decisions, especially
with hundreds of EC2 instance types to choose from. Implementing a tool like Densify to automate
rightsizing based on utilization data across a range of resource-related metrics will produce more
accurate instance type selections without that manual overhead.
Implementing automated rightsizing for EC2 instances has the benefit of ensuring that an appropriate
instance type is being used to support the workloads of an EKS cluster. Workloads need correctly sized
instances to ensure that performance and scalability requirements are being met. Additionally,
rightsizing is critical for cost optimization by reducing unnecessary compute resources.
The table below provides a brief overview of the various EC2 instance categories available for AWS. Each
category of instance types is suitable for different use cases, and users can implement optimization
tools to select the appropriate instance types.
Instance type category | Description | Example instance types |
---|---|---|
General purpose | Provides a balance among compute, memory, and networking capacity. These instances are a good starting point for users without specialized use cases. | T2, T3, M4, M5, A1 |
Compute optimized | These instances contain higher-performance processors with a large number of cores. | C4, C5, C6a |
Memory optimized | These instances can contain massive amounts of memory: up to 24 TB. | R4, R5, High Memory, z1d, R6a |
Accelerated computing | These instances contain GPUs, making them useful for use cases like machine learning. | P2, P3, P4, G3, G5 |
Storage optimized | These instances contain high-performance NVMe SSD storage for storage-intensive workloads. | Im4gn, Is4gen, I4i |
HPC optimized | Instances for high-performance computing (HPC) applications contain a combination of powerful processors, large memory capacity, and NVMe SSD storage. | Hpc6id, Hpc6a |
Spot instances are an affordable approach to deploying EC2
compute capacity. Users can procure unused compute capacity from AWS, typically at a discount compared
to regular on-demand rates. AWS offers discounts for this type of instance because it essentially
represents spare data center capacity. The pricing difference can reach 90% and will vary depending on
the instance type, region, and the utilization of on-demand instances.
A significant drawback of spot instances to be aware of is the possibility of instance “interruption.”
This term is used to describe an event where AWS decides to reclaim its compute capacity and terminate
the spot instance. AWS may do this when it needs the capacity to serve on-demand instance requests.
Since on-demand instances always have a higher priority than spot instances, the user’s spot instance
will terminate to free up capacity for on-demand users.
This behavior is a significant change from on-demand instances, where users can typically expect
instances to run uninterrupted for extended periods (like years), only being interrupted if a hardware
failure occurs. Spot instances will provide no such assurances, and users may even experience multiple
interruptions per day.
To ensure that Kubernetes applications don’t experience downtime when a spot interruption occurs, users
are advised to set up Pod
Disruption Budgets. This Kubernetes resource enables graceful rescheduling and shutdown of pods
running on an interrupted spot instance.
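A minimal sketch of a Pod Disruption Budget is shown below. The workload name, label, and minAvailable value are hypothetical and should be adapted to the application's actual labels and availability requirements.

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-app-pdb
spec:
  # Keep at least two replicas running during voluntary disruptions,
  # such as a node drain triggered by a spot interruption handler
  minAvailable: 2
  selector:
    matchLabels:
      app: web-app   # hypothetical label on the protected pods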
Users can implement spot instances in their EKS clusters to benefit from cost savings by enabling this
feature in their Auto Scaling groups.
Auto Scaling groups support launching mixtures of spot and on-demand instances, enabling users to
balance their need for cost savings with their requirements for long-running instances.
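As a hedged illustration, the eksctl snippet below defines a node group that mixes on-demand and spot capacity. The cluster name, instance types, and percentages are placeholder assumptions, not recommended values.

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: example-cluster    # hypothetical cluster name
  region: us-east-1
nodeGroups:
  - name: mixed-capacity-nodes
    minSize: 2
    maxSize: 10
    instancesDistribution:
      # Diversifying instance types reduces the chance of losing all capacity at once
      instanceTypes: ["m5.large", "m5a.large", "m4.large"]
      onDemandBaseCapacity: 1                    # always keep one on-demand instance
      onDemandPercentageAboveBaseCapacity: 25    # beyond the base: 25% on-demand, 75% spot
      spotAllocationStrategy: capacity-optimized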
Note that the challenges described in the previous section about selecting the correct node size apply as
much to spot instances as to reserved instances or those priced at the standard rate.
There are two approaches AWS offers to reduce the cost of EC2 instances by committing to long-term
contracts.
Reserved instances are an AWS feature
allowing users to obtain a discount for EC2 capacity in exchange for committing to long-term usage (one
year or three years). This feature is not specific to EKS but can assist EKS users in achieving
discounts on EC2 worker node capacity when the use case fits.
Reserved instances are available in two classes:

- Standard reserved instances provide the largest discount but cannot be exchanged for a different instance family.
- Convertible reserved instances offer a smaller discount but can be exchanged for reservations with different attributes, including the instance family.
Reserved instances are beneficial for users who understand their long-term EC2 instance requirements and
are comfortable with committing to a one-year or three-year contract. The discounts for reserved
instances can reach 72%, making this a significant element of a cost optimization strategy if instance
usage is predictable.
That said, reserved instances are not suitable for use cases where workload requirements frequently
change. There are also additional factors users should consider before implementing reservations, such
as requirements related to term length, tenancy, and payment installments. These factors will impact the
discount amount.
Savings plans are a feature allowing users to
commit to a certain degree of compute spending for one-year and three-year periods.
There are two classes of savings plans:

- Compute savings plans apply flexibly across instance families, sizes, and regions, as well as Fargate and Lambda usage.
- EC2 instance savings plans commit to a specific instance family in a chosen region in exchange for a larger discount.
The key differences between convertible reservations and instance savings plans are their flexibility and
operational overhead. When users want to change the instance type, convertible reservations require
changing the reservation details to ensure the discount is in effect. Instance savings plans don’t
require this additional step: Changing the instance type within the same family is automatically covered
by the instance savings plan. On the other hand, convertible reservations are more flexible because
they allow changing the instance family, whereas an instance savings plan cannot be modified.
Fargate is a serverless
compute option provided by AWS. It is a fully managed service where users do not need to manage EC2
instance resources to run pods in their EKS clusters. AWS manages the underlying Fargate compute
infrastructure, ensures that capacity is available, replaces unhealthy hosts, and applies security
patches.
Fargate is more expensive to run than an
equivalently sized EC2 instance (in terms of CPU and memory) and thus does not provide an immediate cost
benefit. However, Fargate can be a valuable contributor to a cost optimization strategy due to the
reduced operational overhead enabled by removing EC2 instances from an EKS cluster. Users no longer need
to invest valuable engineering time in maintaining EC2 fleets, which typically involves operations such
as scaling, monitoring, health checking, and upgrading. Since maintaining EC2 instances imposes labor
costs, implementing serverless infrastructure may be a valuable option for cost optimization.
Fargate supports specific use cases and is not suitable for all types of pods. Users should carefully
evaluate whether Fargate fits their needs before attempting a migration.
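For illustration, the eksctl sketch below adds a Fargate profile so that pods created in a given namespace are scheduled onto Fargate instead of EC2 worker nodes. The cluster and namespace names are assumptions.

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: example-cluster    # hypothetical cluster name
  region: us-east-1
fargateProfiles:
  - name: fp-serverless
    selectors:
      # Pods in this namespace run on Fargate; everything else stays on EC2 nodes
      - namespace: serverless-apps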
Rightsizing pods in an EKS cluster involves modifying the resources allocated to the pod to optimize the
balance between cost and performance. A pod is a collection of one or more containers grouped and
deployed to a worker node. All pods consume resources (such as CPU and memory) from the underlying
worker node, and pods requesting excessive resources will result in accruing unnecessary node compute
capacity charges.
Choosing the appropriate values for the pod resource requests and
limits in the Kubernetes manifest is a challenge because measuring utilization during peak hours
is a complex task, especially for short bursts. Tools like Densify are
designed to help users measure utilization peaks and valleys over time and avoid bottlenecks (if
under-provisioned) or financial waste (if over-provisioned). Moreover, as explained in this short video,
Densify integrates with infrastructure-as-code (IaC) tools such as Ansible and Terraform, making it easy
to automate the configuration process.
Pods can vertically scale by allocating more compute resources for the containerized applications to
consume. They can also be scaled horizontally by increasing the number of running pod replicas,
balancing the load between identical copies of the containerized application. Both approaches need to be
rightsized with correct values for effective cost optimization.
Vertically rightsizing pods involves modifying the CPU and memory resources configured in a pod’s
specification to match the pod’s resource requirements appropriately.
Setting the request and limit values too high will result in Kubernetes allocating excessive hardware
resources to the pod and wasting the unused compute capacity. This may cause node autoscaling solutions
to launch additional worker nodes in response to the excessive resource demands. Setting the request and
limit values too low will result in pods encountering performance issues such as CPU throttling,
out-of-memory issues, and potential eviction from the worker node.
Selecting accurate request and limit values is a challenge for administrators due to the administrative
overhead of manually reviewing metrics and making resource requirement estimations. Implementing
automation tools will reduce human error and save time for administrators, especially when there are
many pods to rightsize.
Users are advised to benchmark and load-test their pods to determine an appropriate level of resources to
define in the initial pod specification. Monitoring historical utilization metrics for a pod will
provide insight into the expected resource requirements, and this data can be used to experiment with
request and limit values. Historical metrics should also offer insight into average versus peak
utilization, which may indicate the resource allocation needs to dynamically scale up/down over time.
Optimization tools are recommended for accurately optimizing request and limit values in EKS clusters
rather than relying on manually selected values that may be error-prone and cause operational overhead.
For ongoing optimization, users should consider implementing tools like Densify that
can continuously analyze pod metrics and provide recommendations for configuring pod resource allocation
as the load on the application changes. This will reduce the operational overhead associated with
manually reviewing metrics to determine appropriate pod rightsizing values and improve the accuracy of
the rightsizing recommendations by using a wide range of metric data to determine appropriate values.
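As one open-source illustration of such tooling, the Kubernetes Vertical Pod Autoscaler (a separate component that must be installed in the cluster) can run in recommendation-only mode; the target Deployment name below is hypothetical.

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app        # hypothetical workload to analyze
  updatePolicy:
    # "Off" only publishes request/limit recommendations; it does not change running pods
    updateMode: "Off"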
The example below displays how requests and limits are defined for a pod. These values should be
carefully evaluated based on metrics data to ensure pods have enough resources to perform without
overallocating resources.
apiVersion: v1
kind: Pod
metadata:
  name: requests-limits-example
spec:
  containers:
    - name: my-container
      image: nginx
      resources:
        requests:
          cpu: 100m
          memory: 128Mi
        limits:
          cpu: 200m
          memory: 256Mi
Configuring requests and limits correctly in an EKS cluster will significantly impact worker node costs.
Ensuring that these values are set accurately via automation tools will benefit cost optimization
strategies and maintain application performance.
Horizontally rightsizing pods in an EKS cluster involves adjusting the number of pod replicas to
accurately match the requirements of the user’s application. Rightsizing will help optimize resource
utilization and improve cost efficiency.
Users can manually configure the pod replica values (for example, in a Kubernetes Deployment object).
Selecting a value correctly will require understanding the application’s resource requirements.
Analyzing historical metrics data for the pod’s resource utilization will provide insight into whether
additional replicas are necessary to help handle the high load or whether the replica count can be
reduced to avoid waste.
Pod replica values can be set automatically with the proper tooling to optimize replica counts based on
current requirements. For example, pod replicas in a web application may need to be increased when
receiving high traffic, causing resource utilization to increase. Implementing tooling to automate the
replica values will help users optimize costs by mitigating unnecessary replicas when possible while
still enabling high performance by scaling up when required.
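A minimal sketch of the built-in Horizontal Pod Autoscaler follows, assuming a hypothetical web-app Deployment and a 70% average CPU utilization target; the replica bounds are placeholders.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app        # hypothetical workload to scale
  minReplicas: 2         # floor to keep the application available
  maxReplicas: 10        # ceiling to cap compute spend
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add replicas when average CPU exceeds 70%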
Data
transfer charges represent a significant “hidden” cost of AWS. Users often don’t consider these
costs until they receive an unexpected bill. Careful consideration of an EKS cluster’s network setup
will help contribute to cost optimization efforts by mitigating unnecessary data transfer charges.
Data transfers between EC2 instances within the same availability zone are free, but cross-zone traffic
has a cost. Users are advised to keep traffic between pods in the same zone if possible, especially if
traffic volume is high.
For example, a cluster may be running two applications as two separate pods that require high-volume
communication with each other, like a web application and a backend cache. Users can implement the
Kubernetes node
affinity and pod affinity features to ensure that these pods are placed within the same
availability zone or even the same worker nodes. This not only reduces cross-zone traffic costs but also
improves network latency. Users requiring high availability would need to deploy separate pods in other
availability zones to minimize downtime during a zone failure.
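A hedged sketch of pod affinity is shown below: the cache pod is required to schedule into the same availability zone as pods labeled app: web. The names, labels, and image are hypothetical.

apiVersion: v1
kind: Pod
metadata:
  name: cache-pod
  labels:
    app: cache
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: web    # hypothetical label on the web application pods
          # Co-locate within the same availability zone to avoid cross-zone charges
          topologyKey: topology.kubernetes.io/zone
  containers:
    - name: redis
      image: redis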
A VPC endpoint allows
users to connect to an AWS service directly through their VPCs without traffic traversing the Internet.
This feature is helpful because Internet traffic is charged at a high rate by AWS, so mitigating
unnecessary communication is recommended for cost optimization. VPC endpoints can also improve
performance because they offer lower network latency.
An hourly charge applies to interface VPC endpoints (gateway endpoints for S3 and DynamoDB are
free), so creating one is only worthwhile for services that are heavily used. Analyzing the user's
data transfer charges by service will provide insight into whether creating a VPC endpoint will
be cost-effective.
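As a hedged CloudFormation sketch, the snippet below creates an interface endpoint for the ECR API, a service EKS nodes frequently pull container images from. All resource IDs and the region are placeholders.

Resources:
  EcrApiEndpoint:
    Type: AWS::EC2::VPCEndpoint
    Properties:
      VpcEndpointType: Interface
      ServiceName: com.amazonaws.us-east-1.ecr.api   # region-specific service name
      VpcId: vpc-0123456789abcdef0                   # placeholder VPC ID
      SubnetIds:
        - subnet-0123456789abcdef0                   # placeholder subnet ID
      SecurityGroupIds:
        - sg-0123456789abcdef0                       # placeholder security group ID
      PrivateDnsEnabled: true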
If the user’s EKS cluster requires communication with resources in other VPCs, setting up VPC peering or transit gateways
will help with cost optimization. Implementing inter-VPC connectivity will ensure that traffic stays
within the AWS network rather than traversing the Internet. As mentioned above, Internet traffic carries
a much higher data transfer cost, so implementing these connectivity features can lead to significant
cost savings.
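A minimal CloudFormation sketch of a VPC peering connection follows; the VPC IDs are placeholders, and route table entries (not shown) are also required before traffic can flow between the VPCs.

Resources:
  ClusterToDataVpcPeering:
    Type: AWS::EC2::VPCPeeringConnection
    Properties:
      VpcId: vpc-0aaa1111111111111       # placeholder: VPC hosting the EKS cluster
      PeerVpcId: vpc-0bbb2222222222222   # placeholder: VPC hosting the peer resources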
Setting up appropriate caching, where possible, can significantly reduce network traffic and data
transfer costs. Caching in this context means storing and reusing data for time-consuming operations to
improve performance or mitigate unnecessary network activity. A caching strategy for containerized
applications will need to be designed based on the application’s architecture, but common approaches
include the following:

- Deploying an in-memory cache such as Redis or Memcached (for example, via Amazon ElastiCache) in front of databases to serve repeated queries.
- Caching HTTP responses at the edge with a content delivery network such as Amazon CloudFront.
- Caching expensive computations or frequently accessed objects within the application itself.
Monitoring data transfer charges via AWS Cost Explorer or other
monitoring tools will help users understand where traffic costs can be optimized. Regularly analyzing
these metrics may help identify other cost-saving opportunities. Data transfer is an often overlooked
cost that leads to unwanted surprises for users who aren’t carefully monitoring their utilization levels
and optimizing based on best practices.
Observability data, which is collected using tools such as AWS CloudWatch or Prometheus, is crucial for
EKS cost optimization because it offers insights into the performance and resource usage of cluster
applications and infrastructure. This data enables users to identify areas for cost optimization,
performance enhancement, and budget adherence. Implementing observability tools for EKS resources (such
as pods and nodes) and AWS infrastructure provides a comprehensive view of cluster operations and their
impact on costs.
Kubernetes observability tools assist with cost optimization by delivering valuable information on
cluster performance and resource utilization. Users need to be able to rightsize resources, detect
overallocated resources, and resolve performance bottlenecks based on cluster metrics. Additionally,
observability data can identify cost-related anomalies, such as sudden spikes in resource usage due to
misconfigured pods, enabling users to address potentially expensive issues early on.
Applying
labels to Kubernetes objects allows for categorization by use case, team, organization, etc.,
making it easier to analyze costs by breaking them down into separate sections.
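For example, a Deployment might carry hypothetical labels such as team and cost-center so that cost reports can be grouped along those dimensions; all names and values below are illustrative.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout-service      # hypothetical workload
  labels:
    team: payments            # hypothetical label values for cost attribution
    cost-center: ecommerce
spec:
  replicas: 3
  selector:
    matchLabels:
      app: checkout-service
  template:
    metadata:
      labels:
        app: checkout-service
        team: payments        # repeat cost labels on pods so per-pod costs roll up
        cost-center: ecommerce
    spec:
      containers:
        - name: checkout
          image: nginx        # placeholder image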
Observability data also reveals long-term trends that aid in cost projections. This information helps
users understand cluster growth, resource utilization patterns, and future cost estimates, knowledge of
which is essential for budget planning and ensuring that costs remain within expectations.
That said, observability tools provide a great deal of metrics data, which may be overwhelming for
administrators to manually interpret and process into actionable cost optimization improvements.
Clusters with a large number of pods and nodes will produce significant quantities of observability data
(logs, metrics, and traces). High-volume observability data will be challenging for administrators to
interpret into accurate rightsizing recommendations, and the operational overhead will be significant.
There are also skill and expertise requirements to make accurate recommendations.
Implementing machine learning-based resource optimization tooling such as Densify can
help provide actionable cost optimization recommendations in ways that are more accurate, repeatable,
and time-efficient than manual approaches. In addition, the implementation of these recommendations can
be automated. Optimization tools can evaluate a broad range of historical metrics data against
customer-defined rules, identify patterns, and provide recommendations more accurately and quickly than
a human administrator can. When it comes to leveraging observability data to implement node and pod
rightsizing changes, optimization tools are a useful asset for the cluster administrator to reduce
operational overhead and improve cost efficiency.
Incorporating observability tools for AWS services helps administrators track all cloud infrastructure
costs associated with an EKS cluster. Since cloud billing can be complex, users may struggle to
pinpoint the sources of their expenses, particularly in intricate EKS clusters with numerous components.
Gaining insights into AWS service utilization, long-term trends, and potential anomalies is beneficial
for cost optimization efforts.
Users can follow the best practices above to optimize their EKS clusters and AWS infrastructure to
balance cost and performance. Determining optimal cluster configuration details can be done by
implementing the appropriate resource optimization tools to provide recommendations for more
cost-effective cluster designs, enabling administrators to maintain cost efficiency while meeting
workload performance requirements.