Guide to Kubernetes Autoscaling

K8s ResourceQuota Object

Chapter 4

ResourceQuota is a Kubernetes object that lets administrators restrict the resource usage of cluster tenants per namespace. Namespaces create virtual clusters within a physical Kubernetes cluster, which helps users avoid resource naming conflicts and manage capacity, among other things.

In this article, we explore how to configure Kubernetes resource quotas to govern resource consumption by namespace. We have included demo code illustrating how to use .yaml files with kubectl to configure constraints on hardware usage.

Resource Quota Overview

A Kubernetes cluster has a limited amount of available hardware resources. Its capacity is determined by its worker nodes, each of which contributes a specific number of CPU cores and a specific amount of RAM.

In a shared Kubernetes environment, it is important to pre-define the allocation of resources for each tenant to avoid unintended resource contention and depletion.

Resource quotas serve this purpose: each tenant in a shared Kubernetes cluster can be given a namespace configured with a maximum allowed resource consumption.

To put resource limits on individual containers, use the Kubernetes resource requests and limits associated with pods within each namespace.

Namespace Resource Quotas vs. Pod Requests & Limits

A cluster can contain multiple namespaces intended for administrative separation. Each namespace can in turn contain multiple pods, each running one or more containers. Resource configuration exists at both levels: the namespace resource quota governs the maximum use of computing resources by the namespace as a whole, while pod requests and limits govern the use of computing resources by the containers within each pod.

A Kubernetes resource request is the amount of computing resources that a container reserves for itself. Requests matter because the Kubernetes scheduler uses them to decide which node has enough unreserved capacity to host a pod. A resource limit is the upper bound on the CPU or RAM that a container can use.
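The relationship between a request and a limit can be illustrated with a small Python sketch. It is not part of Kubernetes; the function names parse_cpu and request_fits_limit are hypothetical, and the parser is a simplification that handles only plain core counts ("2", "0.5") and the millicore suffix ("200m").

```python
from decimal import Decimal


def parse_cpu(quantity: str) -> int:
    """Convert a CPU quantity such as "200m", "0.5" or "2" to millicores."""
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return int(quantity[:-1])
    # A plain number is a count of cores; 1 core = 1000 millicores.
    return int(Decimal(quantity) * 1000)


def request_fits_limit(request_cpu: str, limit_cpu: str) -> bool:
    """Return True when the CPU request does not exceed the CPU limit."""
    return parse_cpu(request_cpu) <= parse_cpu(limit_cpu)


print(parse_cpu("100m"))                    # 100
print(parse_cpu("0.5"))                     # 500
print(request_fits_limit("100m", "200m"))   # True
print(request_fits_limit("300m", "200m"))   # False
```

Kubernetes rejects a container whose request is larger than its limit, which is what the second check mimics.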

How to use resource requests and limits to manage container resource usage

Requests and limits are defined per container in the pod’s manifest, typically a YAML file that kubectl submits to the Kubernetes API server.

How Resource Quota Limits Work

Requests for the creation, deletion, and update of system resources go through the Kubernetes API server. Admission controllers intercept these requests and can validate, modify, or reject them before the objects are persisted. The quota admission controller is one of them: it rejects any request that would push a namespace’s usage above its quota.

Once the cluster administrator defines a ResourceQuota object in a namespace, the Kubernetes quota admission controller watches for new objects created in that namespace and keeps track of the namespace’s resource usage.

Enforcing bespoke admission control policies in Kubernetes

If a user or process tries to acquire more of a resource than the quota allows, the admission controller rejects the request and the API server returns a Forbidden error explaining which constraint was violated.
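The decision described above amounts to a comparison of "used plus requested" against the hard limit. The following Python sketch shows that comparison only; it is not the controller’s actual implementation. The function name quota_violations is hypothetical, values are integers in millicores, and the numbers are borrowed from the demo later in this article.

```python
def quota_violations(hard, used, requested):
    """Return the resource names whose quota the request would exceed.

    All three arguments map resource names to integer millicores.
    """
    return [
        name
        for name, amount in requested.items()
        if name in hard and used.get(name, 0) + amount > hard[name]
    ]


hard = {"requests.cpu": 200, "limits.cpu": 300}
used = {"requests.cpu": 110, "limits.cpu": 220}

small_pod = {"requests.cpu": 10, "limits.cpu": 20}
large_pod = {"requests.cpu": 100, "limits.cpu": 200}

print(quota_violations(hard, used, small_pod))  # []
print(quota_violations(hard, used, large_pod))  # ['requests.cpu', 'limits.cpu']
```

An empty list means the request would be admitted; any entry means it would be rejected.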

Practical Usage of Resource Quotas

The following points summarize how Kubernetes resource quotas work in practice:

User teams are assigned to different namespaces to deploy resources.
The administrator creates one ResourceQuota object for each namespace.
Users create resources such as pods and services in their assigned namespace.
The quota system tracks total system resource usage to ensure it does not exceed the hard resource limits defined in the ResourceQuota object.
If an API request tries to create or update a resource that violates a quota, the request will fail with a message explaining the constraint violation.
If the quota is enabled in a namespace for computing resources like CPU and RAM, users must specify requests or limits for those values.
If those values are missing, the quota system may reject pod creation.
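The last two points can be sketched as a pre-flight check that a team might run on a manifest before submitting it. This is an illustration under stated assumptions, not Kubernetes code: the function name missing_compute_fields is hypothetical, the pod is a plain dictionary mirroring the manifest structure, and only cpu and memory entries are considered.

```python
def missing_compute_fields(pod, quota_hard):
    """List "container: field" entries that the quota tracks but the pod omits."""
    missing = []
    for container in pod["spec"]["containers"]:
        resources = container.get("resources", {})
        for key in quota_hard:
            # "requests.cpu" -> section "requests", resource "cpu"
            section, _, resource = key.partition(".")
            if section not in ("requests", "limits") or resource not in ("cpu", "memory"):
                continue  # skip other quota types, e.g. object counts
            present = resource in resources.get(section, {})
            # Kubernetes defaults a missing request to the limit when one is set.
            if section == "requests" and not present:
                present = resource in resources.get("limits", {})
            if not present:
                missing.append(f"{container['name']}: {key}")
    return missing


quota_hard = {"requests.cpu": "200m", "limits.cpu": "300m"}
pod = {
    "spec": {
        "containers": [
            {"name": "quota-test", "resources": {"requests": {"cpu": "100m"}}}
        ]
    }
}

print(missing_compute_fields(pod, quota_hard))  # ['quota-test: limits.cpu']
```

A non-empty result indicates a pod that the quota system would be expected to reject in a namespace with this quota.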

ResourceQuota support is enabled by default on most Kubernetes distributions. It can also be enabled manually on the API server by including ResourceQuota among the arguments of the --enable-admission-plugins= flag.

A resource quota is enforced in a particular namespace when there is a ResourceQuota object in that namespace. A quota can cap compute resources such as CPU and memory, storage requests, and the number of objects of a given type, such as pods or services.

How to use the Resource Quota

In this section, we go through a demo example of how to create and define the CPU resource quota on a namespace with requests and limits.

We will then try to exceed the quota by creating additional pods, to show that once the defined limit is reached, no more resources that count against that quota can be created.

Demo Code

First, check that you have a Kubernetes cluster up and running. At least one worker node is required to perform this demo; the cluster used here has two.

C02W84XMHTD5:terraform-dev iahmad$ kubectl get nodes
  NAME                        STATUS   ROLES    AGE    VERSION
  autoscale-concourse-vmxh   Ready    <none>   4m    v1.21.2
  autoscale-default-vm37     Ready    <none>   56m   v1.21.2

Next, create a namespace named quota-demo to test Kubernetes resource quotas:

C02W84XMHTD5:terraform-dev iahmad$ kubectl create namespace quota-demo
  namespace/quota-demo created

Then, define and create the CPU quota on that namespace:

C02W84XMHTD5:terraform-dev iahmad$ cat cpu-quota.yaml 
  apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: test-cpu-quota
    namespace: quota-demo
  spec:
    hard:
      requests.cpu: "200m"  
      limits.cpu: "300m"
  C02W84XMHTD5:terraform-dev iahmad$ 
  C02W84XMHTD5:terraform-dev iahmad$ kubectl create -f cpu-quota.yaml
  resourcequota/test-cpu-quota created

You can verify that the ResourceQuota object has been created. Note the “Used” column: it shows 0 because the namespace is still empty and none of the quota has been consumed.

C02W84XMHTD5:terraform-dev iahmad$ kubectl describe resourcequota/test-cpu-quota --namespace quota-demo
  Name:         test-cpu-quota
  Namespace:    quota-demo
  Resource      Used  Hard
  --------      ----  ----
  limits.cpu    0     300m
  requests.cpu  0     200m

Next, create a test pod with requests and limits defined as shown below:

C02W84XMHTD5:terraform-dev iahmad$ kubectl create -n quota-demo -f- <<EOF
  apiVersion: v1
  kind: Pod
  metadata:
    name: testpod1
  spec:
    containers:
    - name: quota-test
      image: busybox
      imagePullPolicy: IfNotPresent
      command: ['sh', '-c', 'echo Pod is Running ; sleep 5000']
      resources:
        requests:
          cpu: "100m"
        limits:
          cpu: "200m"
    restartPolicy: Never
  EOF
  pod/testpod1 created

Describe the quota again and note the “Used” column. It now reflects the requests and limits of the pod just created:

C02W84XMHTD5:terraform-dev iahmad$ kubectl describe resourcequota/test-cpu-quota --namespace quota-demo
  Name:         test-cpu-quota
  Namespace:    quota-demo
  Resource      Used  Hard
  --------      ----  ----
  limits.cpu    200m  300m
  requests.cpu  100m  200m

Next, create another pod and observe the remaining quota:

C02W84XMHTD5:terraform-dev iahmad$ kubectl create -n quota-demo -f- <<EOF
  apiVersion: v1
  kind: Pod
  metadata:
    name: testpod2
  spec:
    containers:
    - name: quota-test
      image: busybox
      imagePullPolicy: IfNotPresent
      command: ['sh', '-c', 'echo Pod is Running ; sleep 5000']
      resources:
        requests:
          cpu: "10m"
        limits:
          cpu: "20m"
    restartPolicy: Never
  EOF
  pod/testpod2 created
  C02W84XMHTD5:terraform-dev iahmad$ kubectl describe resourcequota/test-cpu-quota --namespace quota-demo
  Name:         test-cpu-quota
  Namespace:    quota-demo
  Resource      Used  Hard
  --------      ----  ----
  limits.cpu    220m  300m
  requests.cpu  110m  200m

The “Used” column has updated again. The second pod added 10m to requests.cpu and 20m to limits.cpu, which leaves 90m of CPU requests and 80m of CPU limits available in the namespace.
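The figures in the “Used” column are simply the sums of the requests and limits of the pods in the namespace. As a rough illustration (not how the quota controller is implemented), the Python sketch below recomputes them from the two pods created so far; the function name usage is hypothetical and all values are millicores.

```python
def usage(pods):
    """Sum each resource across all pods; values are integer millicores."""
    totals = {}
    for resources in pods.values():
        for name, amount in resources.items():
            totals[name] = totals.get(name, 0) + amount
    return totals


hard = {"requests.cpu": 200, "limits.cpu": 300}
pods = {
    "testpod1": {"requests.cpu": 100, "limits.cpu": 200},
    "testpod2": {"requests.cpu": 10, "limits.cpu": 20},
}

used = usage(pods)
remaining = {name: hard[name] - used.get(name, 0) for name in hard}

print(used)       # {'requests.cpu': 110, 'limits.cpu': 220}
print(remaining)  # {'requests.cpu': 90, 'limits.cpu': 80}
```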

If we now try to create a pod that requests more resources than the quota has left, the API server rejects it with an error stating that the quota would be exceeded:

C02W84XMHTD5:terraform-dev iahmad$ kubectl create -n quota-demo -f- <<EOF
  apiVersion: v1
  kind: Pod
  metadata:
    name: testpod3
  spec:
    containers:
    - name: quota-test
      image: busybox
      imagePullPolicy: IfNotPresent
      command: ['sh', '-c', 'echo Pod is Running ; sleep 5000']
      resources:
        requests:
          cpu: "100m"
        limits:
          cpu: "200m"
    restartPolicy: Never
  EOF
  
  
  Error from server (Forbidden): error when creating "STDIN": pods "testpod3" is forbidden: exceeded quota: test-cpu-quota, requested: limits.cpu=200m,requests.cpu=100m, used: limits.cpu=220m,requests.cpu=110m, limited: limits.cpu=300m,requests.cpu=200m
  C02W84XMHTD5:terraform-dev iahmad$

Finally, clean up by deleting the namespace, which also deletes the resources contained in it:

C02W84XMHTD5:terraform-dev iahmad$ kubectl delete ns quota-demo --cascade
  namespace "quota-demo" deleted

Best Practices

Define resource requests and limits for your workloads and a ResourceQuota object for each namespace. The Kubernetes cluster will be more stable and there will be fewer disruptions.
Developers and DevOps engineers on your team should perform CPU and memory profiling to determine application resource requirements in advance, then communicate those numbers to the cluster administrator so that optimal values for requests, limits, and quotas can be configured.
DevOps teams should monitor the actual usage of resources as compared to the allocated (or reserved) capacity to ensure that resources aren’t wasted.
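The last practice comes down to comparing observed usage with reserved capacity. A minimal Python sketch of that comparison follows. The function names, the 50% threshold, and the observed figure of 33m are all hypothetical and chosen only for illustration; in practice the observed value would come from your monitoring system.

```python
def utilization(reserved_millicores, observed_millicores):
    """Fraction of the reserved CPU that the workload actually used."""
    if reserved_millicores <= 0:
        raise ValueError("reserved capacity must be positive")
    return observed_millicores / reserved_millicores


def is_overprovisioned(reserved_millicores, observed_millicores, threshold=0.5):
    """True when observed usage falls below the given share of the reservation."""
    return utilization(reserved_millicores, observed_millicores) < threshold


# Hypothetical example: 110m reserved through requests, 33m observed on average.
print(f"{utilization(110, 33):.0%} of reserved CPU in use")  # 30% of reserved CPU in use
print(is_overprovisioned(110, 33))                           # True
```

A result like this would suggest lowering the requests, which in turn frees quota for other workloads in the namespace.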