Amazon Elastic Compute Cloud (EC2) is an AWS service offering that delivers secure and scalable cloud compute capacity.
Each of your EC2 instances is a virtual server providing compute power capable of running apps within the AWS public cloud.
EC2 instances are launched from an Amazon Machine Image (AMI)—an AWS template that defines the OS and operating environment for one or more EC2 instances of one or more instance types.
Each instance type delivers a mix of CPU, memory, storage, and networking capacity, across one or more size options, and should be carefully matched to your workload’s unique demands.
AWS groups instances into families that offer discrete capabilities for your workloads.
The elastic designation in Elastic Compute Cloud refers to the ability to increase your EC2 instance footprint on demand—up or down—manually, or automatically through Auto Scaling.
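As a rough illustration of how automatic scaling decides capacity, here is a minimal Python sketch of a target-tracking rule similar in spirit to what EC2 Auto Scaling applies. The function name and utilization figures are hypothetical; real policies also layer on cooldowns, instance warm-up, and minimum/maximum capacity bounds:

```python
import math

def desired_capacity(current_instances, current_utilization, target_utilization):
    """Simplified target-tracking rule: resize the fleet so average
    utilization moves back toward the target. Real Auto Scaling policies
    also apply cooldowns, warm-up periods, and min/max capacity bounds."""
    return max(1, math.ceil(current_instances * current_utilization / target_utilization))

# Four instances averaging 75% CPU against a 50% target scale out to six.
print(desired_capacity(4, 0.75, 0.50))  # 6
```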
At the time of writing, there are nearly 280 EC2 instance types available across nearly 25 regions and 70 availability zones—yielding millions of possible permutations to choose from when optimizing the EC2 instance selection for your workload.
So, how do you choose?
Although other public cloud providers may leverage different groupings and terminologies for their compute service offerings, the general concepts outlined below will apply.
General purpose instances are designed for scalable services, such as web servers, microservices, and distributed data stores. Within the general purpose category, there are A1, T3, T3a, T2, M5, M5a, and M4 instance types.
The A1 category is used for Arm processor-based workloads. The T3s are burstable general purpose instance types that provide a baseline of CPU performance, but can burst to higher levels of CPU performance during spikes in CPU demand.
Current-generation A1 instance offerings include a1.medium, a1.large, a1.xlarge, a1.2xlarge, a1.4xlarge, and a1.metal.
T3 instances, and the AMD-based T3a variant, are configured with a balance of CPU, memory, and networking resources. There are seven T3 models, ranging from the t3.nano with two virtual CPUs and 0.5GB of memory to the t3.2xlarge with eight virtual CPUs and 32GB of memory.
By default, T3 instances launch in unlimited mode, which lets them burst above baseline for as long as needed. This helps prevent CPU starvation, but it also leaves customers at risk of paying more than they have to for the same level of CPU resources. T3 instances run on the Nitro system, which enables network and EBS bursting as well.
Current-generation T instance offerings include t2.nano, t2.micro, t2.small, t2.medium, t2.large, t2.xlarge, t2.2xlarge, t3.nano, t3.micro, t3.small, t3.medium, t3.large, t3.xlarge, t3.2xlarge, t3a.nano, t3a.micro, t3a.small, t3a.medium, t3a.large, t3a.xlarge, and t3a.2xlarge.
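The burst-credit arithmetic can be sketched in a few lines of Python. One CPU credit equals one vCPU running at 100% utilization for one minute; the 5%-per-vCPU baseline used here for a t3.nano reflects published T3 figures at the time of writing, but check the AWS documentation for current rates:

```python
# Sketch of the T3 CPU-credit model. One CPU credit = one vCPU at 100%
# utilization for one minute. Baseline figures here are illustrative.

def credits_earned_per_hour(vcpus, baseline_per_vcpu):
    """Credits accrued in one hour while running at or below baseline."""
    return baseline_per_vcpu * vcpus * 60

def credits_spent_per_hour(vcpus, utilization):
    """Credits consumed in one hour at the given average CPU utilization."""
    return utilization * vcpus * 60

# t3.nano: 2 vCPUs with a 5% per-vCPU baseline earns 6 credits/hour.
earned = credits_earned_per_hour(vcpus=2, baseline_per_vcpu=0.05)

# Running the same instance at 40% average utilization spends 48
# credits/hour, drawing the balance down by 42 credits/hour; in the
# default unlimited mode, surplus credits are billed once the accrued
# balance runs out.
spent = credits_spent_per_hour(vcpus=2, utilization=0.40)
print(earned, spent, spent - earned)  # 6.0 48.0 42.0
```

This is why unlimited mode can quietly cost more: a sustained deficit is billed as surplus credits rather than throttled.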
The M5, M5a, and M4 instances are also balanced CPU and memory instances. They’re designed for small and midsize databases and back-end applications.
Current-generation M instance offerings include m4.large, m4.xlarge, m4.2xlarge, m4.4xlarge, m4.10xlarge, m4.16xlarge, m5.large, m5.xlarge, m5.2xlarge, m5.4xlarge, m5.8xlarge, m5.12xlarge, m5.16xlarge, m5.24xlarge, m5.metal, m5a.large, m5a.xlarge, m5a.2xlarge, m5a.4xlarge, m5a.8xlarge, m5a.12xlarge, m5a.16xlarge, m5a.24xlarge, m5ad.large, m5ad.xlarge, m5ad.2xlarge, m5ad.4xlarge, m5ad.8xlarge, m5ad.12xlarge, m5ad.16xlarge, m5ad.24xlarge, m5d.large, m5d.xlarge, m5d.2xlarge, m5d.4xlarge, m5d.8xlarge, m5d.12xlarge, m5d.16xlarge, m5d.24xlarge, m5d.metal, m5dn.large, m5dn.xlarge, m5dn.2xlarge, m5dn.4xlarge, m5dn.8xlarge, m5dn.12xlarge, m5dn.16xlarge, m5dn.24xlarge, m5n.large, m5n.xlarge, m5n.2xlarge, m5n.4xlarge, m5n.8xlarge, m5n.12xlarge, m5n.16xlarge, and m5n.24xlarge.
The C5, C5n, and C4 instance types offer the lowest price-per-vCPU ratio of the instance types. These are designed for compute-intensive workloads, like batch processing, data analytics, machine learning, and high-performance computing. C5 instances come in nine models, from the c5.large with two virtual CPUs and 4GB of memory to the c5d.18xlarge with 72 virtual CPUs and 144GB of memory.
Current-generation C instance offerings include c4.large, c4.xlarge, c4.2xlarge, c4.4xlarge, c4.8xlarge, c5.large, c5.xlarge, c5.2xlarge, c5.4xlarge, c5.9xlarge, c5.12xlarge, c5.18xlarge, c5.24xlarge, c5.metal, c5d.large, c5d.xlarge, c5d.2xlarge, c5d.4xlarge, c5d.9xlarge, c5d.12xlarge, c5d.18xlarge, c5d.24xlarge, c5d.metal, c5n.large, c5n.xlarge, c5n.2xlarge, c5n.4xlarge, c5n.9xlarge, c5n.18xlarge, and c5n.metal.
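Because price per vCPU is the deciding metric for this family, the comparison is easy to script. The sketch below ranks a few types by on-demand price per vCPU; the hourly prices are illustrative placeholders, not current AWS rates, which vary by region and change over time:

```python
# Rank instance types by on-demand price per vCPU.
# Prices below are illustrative placeholders, not real AWS rates.
catalog = {
    "c5.large":   {"vcpus": 2, "price_per_hour": 0.0850},
    "c5.2xlarge": {"vcpus": 8, "price_per_hour": 0.3400},
    "m5.large":   {"vcpus": 2, "price_per_hour": 0.0960},
}

def price_per_vcpu(name):
    spec = catalog[name]
    return spec["price_per_hour"] / spec["vcpus"]

# Cheapest compute per dollar comes first.
for name in sorted(catalog, key=price_per_vcpu):
    print(f"{name}: ${price_per_vcpu(name):.4f} per vCPU-hour")
```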
The R, X, and Z families of instances are memory optimized. These are designed for memory-intensive applications such as databases and real-time stream processing. The R5 family has 18 different models, starting with the r5.large with two virtual CPUs and 16GB of memory, going up to the r5d.24xlarge and r5d.metal, both with 96 virtual CPUs and 768GB of memory. The X1e and X1 are well-suited for database servers with up to 128 virtual CPUs and 3,904GB of memory. The Z1D is a specialty instance with a 4.0GHz sustained frequency; it’s designed for applications with high per-core licensing costs.
Current-generation R, U, X, and Z instance offerings include r4.large, r4.xlarge, r4.2xlarge, r4.4xlarge, r4.8xlarge, r4.16xlarge, r5.large, r5.xlarge, r5.2xlarge, r5.4xlarge, r5.8xlarge, r5.12xlarge, r5.16xlarge, r5.24xlarge, r5.metal, r5a.large, r5a.xlarge, r5a.2xlarge, r5a.4xlarge, r5a.8xlarge, r5a.12xlarge, r5a.16xlarge, r5a.24xlarge, r5ad.large, r5ad.xlarge, r5ad.2xlarge, r5ad.4xlarge, r5ad.12xlarge, r5ad.24xlarge, r5d.large, r5d.xlarge, r5d.2xlarge, r5d.4xlarge, r5d.8xlarge, r5d.12xlarge, r5d.16xlarge, r5d.24xlarge, r5d.metal, r5dn.large, r5dn.xlarge, r5dn.2xlarge, r5dn.4xlarge, r5dn.8xlarge, r5dn.12xlarge, r5dn.16xlarge, r5dn.24xlarge, r5n.large, r5n.xlarge, r5n.2xlarge, r5n.4xlarge, r5n.8xlarge, r5n.12xlarge, r5n.16xlarge, r5n.24xlarge, u-6tb1.metal, u-9tb1.metal, u-12tb1.metal, u-18tb1.metal, u-24tb1.metal, x1.16xlarge, x1.32xlarge, x1e.xlarge, x1e.2xlarge, x1e.4xlarge, x1e.8xlarge, x1e.16xlarge, x1e.32xlarge, z1d.large, z1d.xlarge, z1d.2xlarge, z1d.3xlarge, z1d.6xlarge, z1d.12xlarge, and z1d.metal.
Accelerated computing instances provide graphics processing units (GPUs) or field programmable gate arrays (FPGAs) that are used for machine learning, high-performance computing, and other numerically intensive workloads. The P2, P3, G3, and F1 families have accelerated computing instances. The P2, P3, and G3 families each have three or four models with one to 16 GPUs, four to 96 virtual CPUs, and 30.5GB to 768GB of memory.
Current-generation F, G, Inf, and P instance offerings include f1.2xlarge, f1.4xlarge, f1.16xlarge, g3s.xlarge, g3.4xlarge, g3.8xlarge, g3.16xlarge, g4dn.xlarge, g4dn.2xlarge, g4dn.4xlarge, g4dn.8xlarge, g4dn.12xlarge, g4dn.16xlarge, p2.xlarge, p2.8xlarge, p2.16xlarge, p3.2xlarge, p3.8xlarge, p3.16xlarge, p3dn.24xlarge, inf1.xlarge, inf1.2xlarge, inf1.6xlarge, and inf1.24xlarge.
The I, D, and H families make up the storage optimized instances. The I3 and I3en use non-volatile memory express (NVMe) SSD storage. These devices are optimized for low latency, random I/O and high sequential reads. The D2 instances are backed by hard disk drives and offer large-volume, low-cost persistent storage with up to 48TB per instance. The H1 instances have up to 16TB of hard disk drive storage designed for high disk throughput.
Current-generation D, H, and I instance offerings include d2.xlarge, d2.2xlarge, d2.4xlarge, d2.8xlarge, h1.2xlarge, h1.4xlarge, h1.8xlarge, h1.16xlarge, i3.large, i3.xlarge, i3.2xlarge, i3.4xlarge, i3.8xlarge, i3.16xlarge, i3.metal, i3en.large, i3en.xlarge, i3en.2xlarge, i3en.3xlarge, i3en.6xlarge, i3en.12xlarge, i3en.24xlarge, and i3en.metal.
A few patterns emerge when describing the different kinds of instances you can use in AWS. First, there are a large number of instances. Even within a single family of instances, there can be up to 18 different configurations. This makes it difficult to find the optimal instance type for a given workload.
For example, you may have a compute-intensive application, but you may not be sure if you should use a compute-optimized instance or an accelerated instance. If the application makes many floating point calculations and the code can take advantage of a GPU, an accelerated instance may be the best option. In another scenario, you may want to deploy a distributed computing application that’s I/O-intensive; should you use an SSD or hard disk drive-backed instance? That will depend on the importance of low latency I/O balanced against cost considerations.
These characteristics are key to making accurate comparisons between instance types, which can help you decide which kind of instance is best for a particular workload. Also consider technical constraints on different instances, such as which images run on particular instances, whether EBS and network throughput can burst, and the limits of local storage. Finally, consider any business policies in effect that might limit your options. IaaS clouds offer a significant opportunity to reduce infrastructure costs and increase an enterprise's ability to adapt to changing business conditions and the related demands on infrastructure. But the wide array of instance type choices can be bewildering, and mistakes can be costly as well as degrade performance.
To make an optimal selection in instance type, you need to understand the detailed nuances of the different instance types and models, as well as your workload’s characteristics—including short-term burst in CPU load and longer-term variations in workload due to business cycles. For small numbers of instances this can be done manually, but as environments grow, this can be very challenging.
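The manual version of this matching process boils down to filtering a catalog of instance specs against a workload's requirements. The sketch below does exactly that with a tiny illustrative sample; at scale you would feed it from the EC2 DescribeInstanceTypes API rather than a hand-maintained table, and you would weigh far more dimensions than vCPUs and memory:

```python
# Minimal sketch of programmatic instance selection: filter a catalog
# of instance specs against a workload's resource requirements.
# The specs here are a tiny illustrative sample, not a complete catalog.
catalog = [
    {"name": "t3.medium",  "family": "general", "vcpus": 2, "memory_gb": 4},
    {"name": "m5.xlarge",  "family": "general", "vcpus": 4, "memory_gb": 16},
    {"name": "c5.2xlarge", "family": "compute", "vcpus": 8, "memory_gb": 16},
    {"name": "r5.xlarge",  "family": "memory",  "vcpus": 4, "memory_gb": 32},
]

def candidates(min_vcpus, min_memory_gb, family=None):
    """Return instance types meeting the workload's minimum requirements,
    optionally restricted to one instance family."""
    return [
        spec["name"]
        for spec in catalog
        if spec["vcpus"] >= min_vcpus
        and spec["memory_gb"] >= min_memory_gb
        and (family is None or spec["family"] == family)
    ]

# A workload needing at least 4 vCPUs and 16GB of memory:
print(candidates(4, 16))  # ['m5.xlarge', 'c5.2xlarge', 'r5.xlarge']
```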
Densify helps enterprises manage EC2 instance selection at scale, providing recommendations for the best-fit type for each of your workloads. Request a demo, and we'll show you the power of machine-learning-driven EC2 instance selection.