Provision AWS EKS with Terraform: Why & How

There’s a near-endless list of administrative tools for DevOps practitioners who need to provision, configure, and deploy resources. This article takes a closer look at one tool in particular, Terraform, and how it can be used to provision AWS EKS. Specifically, we’ll discuss:

  1. How Terraform compares to other tools for this task.
  2. Instructions to configure your first environment.
  3. Advanced best practices contributed by some of the most experienced administrators.

But first, let’s start with the basics.

What is AWS EKS?

At its core, AWS EKS is a service that provisions and manages the control plane (configuration files, API server, and controllers) of your Kubernetes (K8s) cluster. This means that you won’t have to worry about your control plane’s security, high availability, and upgrades. AWS EKS doesn’t automatically provision the K8s cluster’s worker nodes that host your workload, so you have the flexibility to provision them as you see fit.

What is Terraform?

Terraform is an open source tool, created by HashiCorp, that is used to declare the desired configuration state of your public and private application infrastructure. After you define a configuration state, Terraform provisions your environment and maintains the state of its changes over time. Terraform came to market as modern microservice-based immutable architecture took root. Since then, it has been gradually supplanting configuration management tools such as Chef, Puppet, and Ansible, as infrastructure administrators’ favorite helper. Terraform’s registry of integrations, also known as providers, boasts an impressive list.

TL;DR

In short, Terraform is the right tool to provision AWS EKS if you have already selected it as your standard cross-platform tool, or if you have decided to adopt an Infrastructure as Code approach and don’t want to be locked into AWS-specific tooling. Otherwise, AWS eksctl is the fastest and simplest method, and CloudFormation for AWS EKS would be a more natural choice for AWS-only IaC.

Why should you use Terraform with AWS EKS?

As you may already know, there are other ways to provision a Kubernetes (a.k.a. K8s) cluster in AWS EKS. So why use Terraform? To answer that question, let’s first look at a few different tools that usually come up in this type of conversation, and their main purpose.

| Tool | Main Purpose | Pro | Con |
| --- | --- | --- | --- |
| Terraform | Infrastructure Provisioning | Leading platform-independent infrastructure as code provisioning tool with lots of integrations. | Newer to Kubernetes and not specialized for it. |
| CloudFormation | AWS Provisioning | Native AWS infrastructure as code provisioning tool intended for use with AWS services. | Not intended for multi-cloud use or data center infrastructure provisioning. |
| kubectl | Native Kubernetes CLI | Native K8s command line interface (CLI) that supports both imperative commands and declarative configuration files. | Not intended for use outside of K8s, and naturally useful only after your cluster is provisioned. |
| eksctl | AWS Kubernetes CLI | Leverages the AWS CLI, Kubernetes API, and kubectl to automate provisioning of all required AWS resources. | Designed to serve only as a CLI for AWS EKS. |
| Helm | Kubernetes Package Manager | Package manager for deploying applications inside an existing Kubernetes cluster. | Not intended to provision a cluster. Helm 2 also required a server-side component (Tiller) in the cluster, which Helm 3 removed. |
| kOps | Kubernetes Cluster Management | A pre-EKS tool for easily setting up a new K8s cluster on AWS. | Overlaps with the core functionality of EKS. |
| Chef, Puppet, Ansible, SaltStack | Infrastructure Configuration Management | Widely used configuration management tools. | Not intended for provisioning infrastructure as code, and not specialized in either AWS or Kubernetes. |

So, where does that leave us? Well, like most things, there’s some important nuance to understand about these tools—namely, the category of infrastructure tooling they fall under. Knowing each tool’s fundamental approach to solving a problem can help you better select the right tool to create the best (most scalable) solution for your specific use case. For example, a knife can also be used as a screwdriver—but it’s far from ideal.

Let’s take a look at the main categories these tools fall under.

Infrastructure Tooling Categories

To simplify the concepts, we group the tools above into three categories:

Infrastructure as Code (IaC)

Terraform and CloudFormation are both infrastructure as code provisioning tools that use declarative definitions, kept under version control, together with tracked state to provision a complex environment in a desired state. This means that you can use either one as a single tool for all of your provisioning needs. kOps also falls in this category, even though it is limited to K8s cluster management.

Figure: The concept of Infrastructure as Code (IaC). IaC uses declarative statements to provision infrastructure and maintain its state.

Package Manager

Helm, on the other hand, is a package manager for applications configured to run on a Kubernetes cluster. This means that, with an impressive list of packaged applications known as charts available in its artifact hub, you can quickly deploy applications on Kubernetes in the same way that Yum or Apt are used with Linux. Helm’s strength lies in its customized charts used for managing your specific application environments on K8s. However, it’s not intended to provision datacenter infrastructure or general cloud services.

Configuration Management

Chef, Puppet, Ansible, and SaltStack are first-generation tools that were designed to automate infrastructure management when architectures were mutable. In a mutable architecture, you deploy your systems and keep changing their configuration over time (thus configuration management). Modern architecture tends to be immutable, meaning a system is not designed to evolve over time, but to be wiped away and replaced with an entirely new one.

It is common for an administrator to use tools in all three categories. For example, an administrator might:

  • Use Terraform to provision both private and public cloud infrastructure
  • Use Helm to deploy a customized application package on Kubernetes
  • Use Ansible to configure servers supporting a legacy application in their data center.

The benefits of deploying a Kubernetes cluster on AWS EKS

This brings us to another common question: What are the benefits of using AWS EKS when you can run Kubernetes yourself on AWS EC2 instances or VMware VMs?

Even though AWS isn’t shy about promoting the many virtues of their offering, in our view, it boils down to a few essentials:

| AWS EKS Feature | Benefit |
| --- | --- |
| Identity and Access Management (IAM) | IAM users and roles help you manage secure access by K8s objects |
| EKS Resilience | The EKS control plane is provisioned across Availability Zones for resilience |
| Cluster Autoscaler | Automates the provisioning and termination of nodes based on workload |
| EBS Fast Snapshots | A quicker way to back up both your configuration (etcd) and data volumes |
| EKS Fargate | Eliminates the need for you to manage the Kubernetes worker nodes that run your pods |
| EKS Anywhere | Allows you to extend a Kubernetes cluster to VMs in your data center |

Look for our upcoming article that expands on each of the topics presented in this table.

How to set up an EKS Kubernetes cluster using Terraform

Now that you have some context, let’s set up an EKS cluster, which is fairly simple. The first prerequisite is to have Terraform already installed. You can refer to HashiCorp’s installation documentation to install Terraform on your system.

The second requirement is to configure AWS credentials on your system. To do so, you will need an IAM user or role with the required permissions.

Once you have Terraform and AWS Credentials configured on your system, you are all set to create an EKS Cluster on AWS using Terraform.

Create an EKS Cluster

  1. Download sample Terraform configuration
  2. Download and configure the providers for Terraform.
    • terraform init
  3. Create the EKS Cluster
    • terraform apply
  4. Configure kubectl
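    • aws eks --region $(terraform output region) update-kubeconfig --name $(terraform output cluster_name)
      (Recent Terraform releases print string outputs wrapped in quotes; there, use terraform output -raw region and terraform output -raw cluster_name instead.)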

Below is a Terraform configuration file with a few preset sample options:

# AWS provider: the region and pinned provider version are sample values.
provider "aws" {
  region  = "us-east-1"
  version = "2.3.0"
}

# EKS control plane and worker nodes, created with the terraform-aws-modules EKS module.
module "eks_k8s1" {
  source  = "terraform-aws-modules/eks/aws"
  version = "2.3.1"

  cluster_version = "1.12"

  cluster_name = "k8s"
  vpc_id       = "vpc-00000000"

  # Placeholder subnet IDs; replace them with subnets from the VPC above.
  subnets = ["subnet-00000001", "subnet-00000002", "subnet-00000003"]

  cluster_endpoint_private_access = "true"
  cluster_endpoint_public_access  = "true"

  # Write a kubeconfig file and manage the aws-auth ConfigMap from Terraform.
  write_kubeconfig      = true
  config_output_path    = "/.kube/"
  manage_aws_auth       = true
  write_aws_auth_config = true

  # IAM users mapped to Kubernetes groups (placeholder account ID).
  map_users = [
    {
      user_arn = "arn:aws:iam::123456789012:user/user1"
      username = "user1"
      group    = "system:masters"
    },
  ]

  # A fixed-size group of three worker nodes.
  worker_groups = [
    {
      name                 = "workers"
      instance_type        = "t2.large"
      asg_min_size         = 3
      asg_desired_capacity = 3
      asg_max_size         = 3
      root_volume_size     = 100
      root_volume_type     = "gp2"
      ami_id               = "ami-0000000000"
      ebs_optimized        = false
      key_name             = "all"
      enable_monitoring    = false
    },
  ]

  tags = {
    Cluster = "k8s"
  }
}
Example Terraform config
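
The kubectl step above reads two Terraform outputs named region and cluster_name, which the sample configuration does not declare. A minimal sketch of those outputs could look like the following. It assumes the EKS module exposes the cluster name through an output called cluster_id, and it repeats the region as a literal; both are assumptions to check against the module version you actually use.

```hcl
# outputs.tf (hypothetical file name)

# Region used by the provider block; repeated here as a literal for simplicity.
output "region" {
  value = "us-east-1"
}

# Cluster name as reported by the module (assumes a cluster_id output exists).
output "cluster_name" {
  value = "${module.eks_k8s1.cluster_id}"
}
```

The quoted interpolation form is used because it is accepted by both older and newer Terraform versions.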

And that’s it! You’ve provisioned your first EKS cluster.

Tips for Using Terraform with AWS EKS

Use S3 as the backend for storing the .tfstate file

Since you are provisioning a K8s cluster in AWS, you might as well take full advantage of the AWS services available. Store your Terraform state file in S3 and use S3 replication to safeguard it. This prevents you from having to recreate your entire cluster if you were to ever lose access to your state file. S3 replication copies the file to another bucket in the same or a different AWS Region, which helps protect against accidental loss or a disaster.
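
As a rough illustration, an S3 backend can be declared in the terraform block. The bucket name and key below are hypothetical placeholders, and the bucket is assumed to exist already (ideally with versioning and replication enabled).

```hcl
terraform {
  backend "s3" {
    bucket  = "example-terraform-state"   # hypothetical bucket name
    key     = "eks/k8s/terraform.tfstate" # path of the state object in the bucket
    region  = "us-east-1"
    encrypt = true                        # encrypt the state object at rest
  }
}
```

Backend blocks accept only literal values, so these settings cannot reference variables. After adding or changing the block, run terraform init again so Terraform can switch to the new backend.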

Use locking of state file

Remote state locking helps prevent concurrent operations on your resources. If you have configured the S3 backend to store a state file and two administrators attempt to update the same EKS cluster at the same time, locking lets only one operation proceed and so prevents the other from corrupting your state file.

Figure: Locking the Terraform state file to avoid race conditions. With remote state locking, the state file is stored in an S3 bucket and the lock is held in a DynamoDB table.
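
A sketch of DynamoDB-based locking for the S3 backend is shown here. The table and bucket names are hypothetical. The S3 backend expects the lock table to have a string partition key named LockID.

```hcl
# Lock table, usually created once in a separate bootstrap configuration,
# because it must exist before the backend that uses it is initialized.
resource "aws_dynamodb_table" "terraform_lock" {
  name         = "example-terraform-lock" # hypothetical table name
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}

# Backend that stores state in S3 and takes its lock in the table above.
terraform {
  backend "s3" {
    bucket         = "example-terraform-state"   # hypothetical bucket name
    key            = "eks/k8s/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "example-terraform-lock"
  }
}
```

With this in place, a second terraform apply started while the first is still running fails to acquire the lock instead of writing to the same state.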

Tag your EKS clusters and nodes

You may have multiple EKS clusters serving different purposes (e.g., a production vs. a development environment). In such a scenario, tagging EKS clusters helps you organize them by cluster, node, and user profile, and also helps with chargeback when used in conjunction with Cost Allocation tags and AWS Cost Explorer. You may already be using Labels within your K8s cluster for a more granular organization of your workload resources and K8s objects, in which case AWS tagging is simply a way for you to track your EKS resources at the cluster, node, and user profile levels.
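
One way to keep tags consistent is to define them once and reuse the map. The sketch below assumes Terraform 0.12 or later syntax; the Environment and CostCenter keys and their values are made-up examples, not required names.

```hcl
locals {
  # Tags shared by the cluster and its nodes (example keys and values).
  common_tags = {
    Cluster     = "k8s"
    Environment = "production"
    CostCenter  = "platform-team"
  }
}

# Inside the module "eks_k8s1" block, this map would replace the inline tags:
#   tags = local.common_tags
```

User-defined tags show up in cost reports only after they are activated as cost allocation tags in the billing console.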

Use multiple AWS accounts

As part of its Well-Architected Framework, AWS has been promoting the best practice of using multiple AWS accounts to organize your resources. For example, you might deploy one EKS cluster in your production account and another in your development account, or you may separate your accounts to align with cost centers. Fortunately, supporting multiple “providers” is a strength of Terraform. You would then create an IAM role for each of your administrators in a particular account and use it in your Terraform configuration, similarly to the example below.

provider "aws" {
  region = "us-east-1"
  assume_role {
    role_arn = "arn:aws:iam::123456789012:role/iac"
  }
}
Creating an IAM Role for each admin per account
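
To target more than one account from a single configuration, provider aliases can be combined with assume_role. The following is a sketch in Terraform 0.12 or later syntax; the alias names and the two account IDs are hypothetical placeholders.

```hcl
# One aliased provider per account, each assuming a role in that account.
provider "aws" {
  alias  = "production"
  region = "us-east-1"
  assume_role {
    role_arn = "arn:aws:iam::111111111111:role/iac" # placeholder account ID
  }
}

provider "aws" {
  alias  = "development"
  region = "us-east-1"
  assume_role {
    role_arn = "arn:aws:iam::222222222222:role/iac" # placeholder account ID
  }
}

# Resources and data sources choose an account with the provider argument.
data "aws_caller_identity" "production" {
  provider = aws.production
}

output "production_account_id" {
  value = data.aws_caller_identity.production.account_id
}

# A module call selects its account in a similar way:
#   providers = { aws = aws.development }
```

The data source is included only as a quick check that each alias resolves to the account you expect.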

Use Terraform modules and registry

Terraform offers a feature named a “module” that lets you group resources into reusable blocks, which other modules can call to help avoid duplication in your configuration. Another advantage of organizing your configuration into modules is that it aligns with the Terraform Registry’s approach of offering third-party configuration in the form of modules. The Terraform Registry also hosts plugins called “Terraform providers” that add resource types (such as AWS VPC).
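
As an illustration of reuse, the same module can be called more than once with different inputs. The local module path ./modules/eks-cluster and its two input variables are hypothetical; the module itself is not shown and would wrap whatever cluster resources you standardize on.

```hcl
# Development cluster: smaller instances.
module "eks_dev" {
  source        = "./modules/eks-cluster" # hypothetical local module
  cluster_name  = "k8s-dev"
  instance_type = "t2.medium"
}

# Production cluster: same module, different inputs.
module "eks_prod" {
  source        = "./modules/eks-cluster" # hypothetical local module
  cluster_name  = "k8s-prod"
  instance_type = "t2.large"
}
```

A registry module is called the same way, with a registry address in source and a pinned version, as in the earlier EKS example.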

Use multiple Terraform states for more control

Some of your infrastructure components (such as AWS VPC, ELB, or databases) don’t change often in configuration, while others (such as containers) change frequently. Using multiple state files to separate the control of their respective configuration has a couple of advantages:

  • It improves performance since Terraform doesn’t have to collect states for all components to change only one.
  • It reduces the likelihood of errors from frequent changes made in a continuous deployment SDLC.

This is commonly performed using Terraform Workspaces.
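
When state is split, one configuration can read the outputs of another through the terraform_remote_state data source. The sketch below uses Terraform 0.12 or later syntax and assumes a separate network configuration that stores its state in a hypothetical S3 bucket and exports outputs named vpc_id and private_subnets.

```hcl
# Read the outputs of the slowly changing network configuration.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state"   # hypothetical bucket name
    key    = "network/terraform.tfstate" # state of the network configuration
    region = "us-east-1"
  }
}

locals {
  # Values consumed by the cluster configuration (assumed output names).
  vpc_id  = data.terraform_remote_state.network.outputs.vpc_id
  subnets = data.terraform_remote_state.network.outputs.private_subnets

  # With workspaces, the workspace name can distinguish environments.
  cluster_name = "k8s-${terraform.workspace}"
}
```

The cluster configuration can then change frequently without Terraform refreshing or touching the network resources.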

Use the AWS EBS CSI driver to support stateful applications

Kubernetes provides StatefulSets to enable stateful applications that keep referencing their data as pods are added to scale horizontally, with the Cluster Autoscaler adding nodes as needed. This approach in turn requires persistent storage for data such as a cache or a database. AWS supports this use case with the EBS Container Storage Interface (CSI) driver, which is still in beta at the time of writing and is a valuable ingredient for deploying stateful applications on AWS EKS.
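
On current EKS versions the driver is available as a managed add-on, which Terraform can install. This sketch assumes a much newer AWS provider than the one pinned in the sample configuration, an already configured Kubernetes provider, and an IAM role that grants the driver permission to manage EBS volumes (not shown). The storage class name is a made-up example.

```hcl
# Install the EBS CSI driver as an EKS managed add-on.
resource "aws_eks_addon" "ebs_csi" {
  cluster_name = "k8s"
  addon_name   = "aws-ebs-csi-driver"
}

# Storage class that provisions volumes through the CSI driver.
resource "kubernetes_storage_class" "ebs_gp3" {
  metadata {
    name = "ebs-gp3" # hypothetical name
  }

  storage_provisioner = "ebs.csi.aws.com"
  reclaim_policy      = "Delete"
  volume_binding_mode = "WaitForFirstConsumer"

  parameters = {
    type = "gp3"
  }
}
```

A StatefulSet would then reference the storage class by name in its volume claim templates.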

Lower your cluster node costs

There are many ways for you to pay less for the AWS EC2 instances that support your K8s cluster. One common option is to obtain discounts of up to 72% by committing to a certain volume of usage for at least one year via Savings Plans or Reserved Instances.

Another approach is to use AWS Spot Instances for discounts of up to 90%, which is especially handy for workloads that can tolerate delay, such as batch jobs. Remember that the nodes may be taken away with a mere two minutes’ notice. Spot no longer works as an auction, so raising your maximum price is not a reliable way to avoid interruptions; instead, use a mixed instances policy so that your Auto Scaling group can draw on several instance types. This increases your chance of securing Spot capacity when certain types are in short supply. The EC2 instance selector tool can help you programmatically find similar types in the vast and expanding universe of EC2 types and sizes.
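
A mixed instances policy can be sketched on a plain Auto Scaling group as follows, in Terraform 0.12 or later syntax. The resource names, AMI ID, subnet IDs, and the choice of instance types are placeholders, and the user data that joins nodes to the EKS cluster is omitted.

```hcl
# Launch template shared by all instance types in the group.
resource "aws_launch_template" "workers" {
  name_prefix   = "k8s-spot-"
  image_id      = "ami-0000000000" # placeholder AMI ID
  instance_type = "t3.large"
}

resource "aws_autoscaling_group" "spot_workers" {
  name                = "k8s-spot-workers"
  min_size            = 1
  max_size            = 6
  desired_capacity    = 3
  vpc_zone_identifier = ["subnet-00000001", "subnet-00000002"]

  mixed_instances_policy {
    instances_distribution {
      on_demand_base_capacity                  = 1 # keep one On-Demand node
      on_demand_percentage_above_base_capacity = 0 # everything else on Spot
      spot_allocation_strategy                 = "capacity-optimized"
    }

    launch_template {
      launch_template_specification {
        launch_template_id = aws_launch_template.workers.id
        version            = "$Latest"
      }

      # Similar-sized types the group may use when one is in short supply.
      override {
        instance_type = "t3.large"
      }
      override {
        instance_type = "t3a.large"
      }
      override {
        instance_type = "m5.large"
      }
    }
  }
}
```

Listing several comparable types gives the group more Spot pools to choose from, which is the point of the policy.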

For a more sophisticated approach to sizing and cost management across multiple cloud providers and data centers, you may also use third-party tooling vendors who offer policy-driven optimization-as-code functionalities that integrate with Terraform to right-size your environment starting from your cluster nodes all the way up to your containers.

Scale not only horizontally, but also vertically

We discussed the idea of autoscaling cluster nodes, but what about scaling pods? A lack of attention to pod-level rightsizing ultimately leads the cluster autoscaler to provision nodes you don’t need. Kubernetes provides the Vertical Pod Autoscaler (VPA), which can adjust pod resource requests up and down based on historical CPU and memory usage. It can also automatically keep CPU and memory resource limits proportional to resource requests. This helps avoid over-requesting resources, which wastes money, as well as under-requesting resources, which can cause performance bottlenecks. On AWS EKS, the VPA relies on the Metrics Server being installed in the cluster. Once the VPA is enabled, be careful not to let Terraform override its work if both try to drive the resource request settings at the same time.

Figure: HPA adds more pods while VPA resizes the pods. The Vertical Pod Autoscaler (VPA) adjusts the Kubernetes pods’ sizes, while the Horizontal Pod Autoscaler (HPA) adjusts pod counts.
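
If you manage Kubernetes objects from Terraform, a VPA object can be declared with the Kubernetes provider’s kubernetes_manifest resource. This sketch assumes a recent Kubernetes provider, that the VPA components and their custom resource definitions are already installed in the cluster, and that a Deployment named web exists in the default namespace; all of those names are hypothetical.

```hcl
resource "kubernetes_manifest" "web_vpa" {
  manifest = {
    apiVersion = "autoscaling.k8s.io/v1"
    kind       = "VerticalPodAutoscaler"

    metadata = {
      name      = "web"
      namespace = "default"
    }

    spec = {
      # Workload whose pod resource requests the VPA should manage.
      targetRef = {
        apiVersion = "apps/v1"
        kind       = "Deployment"
        name       = "web"
      }

      # "Auto" lets the VPA apply its recommendations; "Off" only reports them.
      updatePolicy = {
        updateMode = "Auto"
      }
    }
  }
}
```

Starting with updateMode set to "Off" is a cautious way to review recommendations before letting the VPA change running pods.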

Conclusion

Public cloud, Infrastructure as Code (IaC), and Kubernetes are the three most important infrastructure architecture trends in recent years. Fortunately, the technologies have sufficiently matured to automate mundane tasks and play nicely together using Terraform and AWS EKS. As your environment scales over time, sizing your containers, nodes, and clusters accurately becomes a common challenge: you need high performance without waste, which is where third-party vendors come in to help.
