Anti-Affinity OpenShift: Tutorial & Instructions


In OpenShift, part of deploying a pod is deciding where it should be scheduled. Pod scheduling is handled by the openshift-kube-scheduler pods running on each controller node. By default, any workload can be placed on any worker node. However, administrators may want to control pod placement.

There are three common techniques to solve this problem: node affinity, pod affinity, and pod anti-affinity. 

OpenShift Pod Placement Techniques

  • Node Affinity: Schedule pods on nodes that match specific node labels (see the sketch after this list).
  • Pod Affinity: Schedule pods on nodes where pods with matching labels already exist.
  • Pod Anti-Affinity: Avoid scheduling pods on nodes where pods with matching labels already exist.
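
For comparison, node affinity uses the same requiredDuringSchedulingIgnoredDuringExecution pattern seen later in this article, but it matches node labels rather than pod labels. Below is a minimal sketch; the disktype=ssd label is a hypothetical node label used only for illustration.

# Sketch: schedule the pod only on nodes carrying the (hypothetical) label disktype=ssd.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: disktype
          operator: In
          values:
          - ssd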

This article will cover pod anti-affinity, which is the practice of scheduling pods while avoiding other pods based on labels. We will review the most common anti-affinity OpenShift use cases and the pod specs required to configure pod anti-affinity. We will also review what happens when a pod fails to schedule because of pod anti-affinity rules.

Anti-affinity OpenShift Use Cases

Before discussing how to configure pod anti-affinity, let us discuss why you might want such specific control over pod scheduling. While specific requirements vary depending on infrastructure, available resources, and workload needs, the two most common anti-affinity OpenShift use cases are high availability and pod pairing.

  • High Availability: Ensure that multiple replicas of the same workload are not running on the same node.
  • Pod Pairing: Combine pod anti-affinity with pod affinity rules to pair specific sets of pods together while still distributing the pairs across the cluster.

High Availability

Pods may need to be on different nodes to ensure high availability for databases or caching applications. In these cases, the anti-affinity rules would be configured to avoid nodes where a pod from the same application is already running. One example is running one Elasticsearch pod per OpenShift cluster node. Another example is Percona, which includes anti-affinity rules in its operator.
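
If a cluster spans multiple availability zones, the same idea can spread replicas across zones rather than individual nodes by changing the topologyKey. Below is a minimal sketch, assuming the nodes carry the standard topology.kubernetes.io/zone label and the pods use a hypothetical app=database label:

# Sketch: keep pods labeled app=database out of zones that already run one.
# Assumes nodes have the topology.kubernetes.io/zone label set by the platform.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - database
      topologyKey: "topology.kubernetes.io/zone"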

Pod Pairing

Sometimes, it is beneficial to pair pods. For example, you may want your Python application paired with a Redis cache, but you only want one cache pod and one application pod per node. In that case, using both pod affinity and pod anti-affinity is ideal.

Configuring Pod Anti-Affinity in OpenShift

Now that we know the why, let’s move on to the how. This section will review the files required to configure pod anti-affinity in OpenShift.

We will use Deployments to configure and launch our pods for our examples. If you are unfamiliar with Deployments, consult the OpenShift documentation.

This first example shows the configuration lines that make up the pod anti-affinity section of a pod spec. The bracketed line numbers are referenced in the breakdown that follows.


[1]  affinity:
[2]    podAntiAffinity:
[3]      requiredDuringSchedulingIgnoredDuringExecution:
[4]      - labelSelector:
[5]          matchExpressions:
[6]          - key: app
[7]            operator: In
[8]            values:
[9]            - hello-openshift
[10]       topologyKey: "kubernetes.io/hostname"
    
  • Line 1, affinity: The start of the affinity section, part of the pod spec.
  • Line 2, podAntiAffinity: Specifies that we are configuring pod anti-affinity.
  • Line 3, requiredDuringSchedulingIgnoredDuringExecution: This long option breaks down into two points: (1) do not schedule the pod unless the rules are met, and (2) if the node labels change after the pod is running, the pod keeps running. The alternative, preferredDuringSchedulingIgnoredDuringExecution, tries to match the rules but still schedules the pod if no node matches (see the sketch after this list).
  • Line 4, labelSelector: Configures label selection.
  • Line 5, matchExpressions: Sets up the labels we need to match.
  • Line 6, key: app: The label to compare, in this case app.
  • Line 7, operator: In: How to compare the label, in this case matching against a list of values.
  • Line 8, values: The list of values to match.
  • Line 9, hello-openshift: The value to match, in this case "hello-openshift". With multiple values, any one of them can match.
  • Line 10, topologyKey: Similar to nodeSelector, topologyKey matches node labels. Here we use "kubernetes.io/hostname", a label present on all nodes by default, which keeps every node in scope for selection.
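
If the rule should be a preference rather than a hard requirement, the preferred variant takes a weight (1 to 100) and wraps the same fields in a podAffinityTerm. A minimal sketch of the softer form of the rule above:

# Sketch: "preferred" anti-affinity; the scheduler tries to avoid co-locating
# pods labeled app=hello-openshift but still schedules the pod if no node qualifies.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - hello-openshift
        topologyKey: "kubernetes.io/hostname"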

So to review, this configuration section states that the pod should not be scheduled on a node where a pod with the label app=hello-openshift is already running.

Below is an example of the Deployment as a complete file. This file creates a Deployment and launches pods with the app=hello-openshift label.

The podAntiAffinity rules dictate that if a pod with the app=hello-openshift label is already on a node, the scheduler will not place another such pod on that node.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-openshift
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-openshift
  template:
    metadata:
      labels:
        app: hello-openshift
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - hello-openshift
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: hello-openshift
        image: openshift/hello-openshift:latest
        ports:
        - containerPort: 80

Notice that we have only one replica configured here. Our configuration allows only one pod with this label per node, so on a cluster with three worker nodes, increasing the replicas to four leaves one pod unscheduled. We review what happens in that case in the When a Pod Cannot Be Scheduled section below.
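
To try this out, the Deployment can be applied and then scaled with the standard oc commands. The file name below is only an example:

# Apply the Deployment, then scale it to four replicas; on a three-worker
# cluster the anti-affinity rule leaves the fourth pod unscheduled.
$ oc apply -f hello-openshift-deployment.yaml
$ oc scale deployment/hello-openshift --replicas=4
$ oc get pods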


This next example combines pod affinity and pod anti-affinity to pair pods.

This Deployment file launches three Redis caching pods on nodes where a pod with the label app=cache does not already exist, which works out to one cache pod per node.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cache
spec:
  selector:
    matchLabels:
      app: cache
  replicas: 3
  template:
    metadata:
      labels:
        app: cache
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - cache
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: redis-server
        image: redis:3.2-alpine

This next example uses the podAffinity spec to place the web pods alongside pods that have the app=cache label, while its podAntiAffinity rule keeps the web pods off nodes that already run one. Together, these rules ensure one cache pod and one web pod are scheduled per node.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: web
  replicas: 3
  template:
    metadata:
      labels:
        app: web
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - web
            topologyKey: "kubernetes.io/hostname"
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - cache
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: web
        image: nginx:1.16-alpine
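
After applying both Deployments, the pairing can be checked by listing the pods along with their node assignments; each worker node should end up with exactly one cache pod and one web pod. The file names below are only examples:

# Apply both Deployments and confirm one cache pod and one web pod per node.
$ oc apply -f cache-deployment.yaml
$ oc apply -f web-deployment.yaml
$ oc get pods -l 'app in (cache, web)' -o wide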

When a Pod Cannot Be Scheduled

When pod anti-affinity rules are configured, it is possible that no node meets the requirements for scheduling a pod. In that case, the pod enters a Pending state. This section reviews how to identify these pending pods using the OpenShift command-line client, oc. If you are unfamiliar with oc, review the OpenShift documentation.

When a pod cannot be scheduled, it shows a Pending status, as shown below. In our example, we increased the replicas to four, so one of the pods could not be placed on a node.

$ oc get pods
NAME                               READY   STATUS    RESTARTS   AGE
hello-openshift-7c669458bc-bf65x   0/1     Pending   0          4m5s
hello-openshift-7c669458bc-cqllf   1/1     Running   0          4m5s
hello-openshift-7c669458bc-gnbgd   1/1     Running   0          4m5s
hello-openshift-7c669458bc-gv5lh   1/1     Running   0          4m5s

Running oc describe pod <pod-name> provides more information about the pod. The output will look similar to the example below.
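
For example, to describe the pending pod from the listing above (the generated pod name will differ in your cluster):

$ oc describe pod hello-openshift-7c669458bc-bf65x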

Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  7m44s                default-scheduler  0/6 nodes are available: 3 node(s) didn't match pod anti-affinity rules, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
  Warning  FailedScheduling  6m8s (x1 over 7m8s)  default-scheduler  0/6 nodes are available: 3 node(s) didn't match pod anti-affinity rules, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.

The output shows why the pod cannot be scheduled. Notice the result shows 0/6 nodes available: three are controller nodes carrying a taint the pod does not tolerate (so workloads cannot be scheduled there), and the other three do not match the pod anti-affinity rules.
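
To confirm the node counts in that message, list the nodes and their roles; on this example cluster there are three master (control plane) nodes and three workers:

# List nodes with their roles to see which are masters and which are workers.
$ oc get nodes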

If we need to view the podAntiAffinity rules in place, we look at the pod definition.

$ oc get pod hello-openshift-7c669458bc-bf65x -o yaml

The output is long, but you want to look for the affinity section similar to the snippet below.

spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - hello-openshift
        topologyKey: kubernetes.io/hostname

For our example, our pods will not be placed on a node where a pod with the app=hello-openshift label is already present. We can view the pod labels as shown below.

$ oc get pods -o wide --show-labels
NAME                               READY   STATUS    RESTARTS   AGE    IP            NODE       NOMINATED NODE   READINESS GATES   LABELS
hello-openshift-7c669458bc-bf65x   0/1     Pending   0          3h2m                                                               app=hello-openshift,pod-template-hash=7c669458bc
hello-openshift-7c669458bc-cqllf   1/1     Running   0          3h2m   10.120.1.36   worker-2                                      app=hello-openshift,pod-template-hash=7c669458bc
hello-openshift-7c669458bc-gnbgd   1/1     Running   0          3h2m   10.120.1.27   worker-1                                      app=hello-openshift,pod-template-hash=7c669458bc
hello-openshift-7c669458bc-gv5lh   1/1     Running   0          3h2m   10.120.1.93   worker-0                                      app=hello-openshift,pod-template-hash=7c669458bc

Conclusion

Controlling where a pod is placed is a powerful OpenShift practice. Node affinity, pod affinity, and pod anti-affinity are three techniques for controlling pod placement. This article covered the use cases and configuration for pod anti-affinity.

The use cases and configuration examples here are a great starting point for OpenShift anti-affinity, but your needs may vary from cluster to cluster. You can use the YAML files from our examples as a baseline to customize your pod anti-affinity rules to meet advanced requirements. 
