Anti-Affinity OpenShift: Tutorial & Instructions

In OpenShift, part of deploying a pod is deciding where to schedule it. Pod scheduling is handled by the openshift-kube-scheduler pods running on each controller node. By default, any workload can be placed on any worker node, but administrators often want more control over pod placement. Three common techniques address this: node affinity, pod affinity, and pod anti-affinity.

OpenShift Pod Placement Techniques

Technique          Description
Node Affinity      Schedule pods on nodes that match specific node labels.
Pod Affinity       Schedule pods on nodes where pods with matching labels already exist.
Pod Anti-Affinity  Avoid scheduling pods on nodes where pods with matching labels already exist.
This article covers pod anti-affinity, which is the practice of scheduling pods so that they avoid other pods based on labels. We will review the most common OpenShift anti-affinity use cases and the pod specs required to configure pod anti-affinity. We will also review what happens when a pod fails to schedule because of pod anti-affinity rules.

Anti-affinity OpenShift Use Cases

Before discussing how to configure pod anti-affinity, let us discuss why you might want such specific control over pod scheduling. While requirements vary with infrastructure, available resources, and workload needs, the two most common OpenShift anti-affinity use cases are high availability and pod pairing.

Use Case           Description
High Availability  Ensure that multiple replicas of a pod do not run on the same node.
Pod Pairing        Combine pod anti-affinity with pod affinity rules to pair specific sets of pods while still distributing the pairs across the cluster.

High Availability

Pods may need to be on different nodes to ensure high availability for databases or caching applications. In these cases, the anti-affinity rules are configured to avoid nodes where pods of the application are already running. One example is running one Elasticsearch pod per OpenShift cluster node. Another is Percona, which includes anti-affinity rules in its operator.

Pod Pairing

Sometimes, it is beneficial to pair pods. For example, you may want your Python application paired with a Redis cache, but with only one cache pod and one application pod per node. In that case, using both pod affinity and pod anti-affinity is ideal; we walk through a complete example of this pattern later in this article.

Configuring Pod Anti-Affinity in OpenShift

Now that we know the why, let’s move on to the how. This section reviews the files required to configure pod anti-affinity in OpenShift. We will use Deployments to configure and launch the pods in our examples; if you are unfamiliar with Deployments, consult the OpenShift documentation. The first example shows the configuration lines that make up the pod anti-affinity section.

 [1] affinity:
 [2]   podAntiAffinity:
 [3]     requiredDuringSchedulingIgnoredDuringExecution:
 [4]     - labelSelector:
 [5]         matchExpressions:
 [6]         - key: app
 [7]           operator: In
 [8]           values:
 [9]           - hello-openshift
[10]       topologyKey: "kubernetes.io/hostname"
Here is a line-by-line description of the configuration:

[1] affinity: Starts the affinity section; part of the pod spec.
[2] podAntiAffinity: Specifies that we are configuring pod anti-affinity.
[3] requiredDuringSchedulingIgnoredDuringExecution: This long option breaks down into two points: do not schedule the pod unless the rules are met, and if the labels change after the pod is running, let the pod keep running. The alternative, preferredDuringSchedulingIgnoredDuringExecution, tries to match the rules but still schedules the pod if no node matches (see the sketch after this list).
[4] - labelSelector: Configures label selection.
[5] matchExpressions: Sets up the labels we need to match.
[6] - key: app The label we want to compare, in this case app.
[7] operator: In How we want to compare the label, in this case matching any listed value.
[8] values: The list of values to match.
[9] - hello-openshift The value to match, in this case "hello-openshift". With multiple values, any of them can match.
[10] topologyKey: "kubernetes.io/hostname" Similar to nodeSelector, the topologyKey matches node labels. Here we pick a label that should be on all nodes by default, which keeps every node in scope for selection.
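For workloads where a hard rule is too strict, the preferred variant mentioned above can be used instead. Below is a minimal sketch, reusing the same app=hello-openshift label; note that the preferred form nests the rule inside a podAffinityTerm and adds a weight from 1 to 100 that the scheduler uses to rank candidate nodes.

      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100                  # higher weight = stronger preference
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - hello-openshift
              topologyKey: "kubernetes.io/hostname"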
To review: the required configuration states that the pod should not be scheduled on a node where a pod with the label app=hello-openshift is already running. Below is the complete Deployment file, which creates a Deployment and launches pods carrying the app=hello-openshift label.

The podAntiAffinity rules dictate that if a pod with the app=hello-openshift label is already on a node, another pod with that label will not be placed on that node.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-openshift
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-openshift
  template:
    metadata:
      labels:
        app: hello-openshift
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - hello-openshift
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: hello-openshift
        image: openshift/hello-openshift:latest
        ports:
        - containerPort: 80
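You can create this Deployment with oc apply. A quick sketch, assuming the file was saved under a hypothetical name, with oc scale as one way to adjust the replica count afterward:

$ oc apply -f hello-openshift-deployment.yaml        # filename is an assumption
$ oc scale deployment/hello-openshift --replicas=4   # raises the replica count later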
Notice that we only have one replica configured, and our configuration allows only one pod per node. If we increase the replicas to four, one pod will not be scheduled due to the rules; we review what happens in that case in the When a Pod Cannot Be Scheduled section below. The next example combines pod affinity and pod anti-affinity to pair pods together. The following Deployment file launches three Redis caching pods on nodes where the label app=cache does not already exist, effectively one pod per node.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cache
spec:
  selector:
    matchLabels:
      app: cache
  replicas: 3
  template:
    metadata:
      labels:
        app: cache
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - cache
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: redis-server
        image: redis:3.2-alpine
This next example adds podAffinity specs to align the web pods with pods that have the app=cache label, while a podAntiAffinity rule keeps the web pods on separate nodes. The result is one cache pod and one web pod scheduled per node.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: web
  replicas: 3
  template:
    metadata:
      labels:
        app: web
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - web
            topologyKey: "kubernetes.io/hostname"
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - cache
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: web
        image: nginx:1.16-alpine
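Once both Deployments are running, you can check the pairing by listing the pods with their nodes. Sorting by node name makes the one-cache-plus-one-web-per-node layout easy to see; this assumes your oc client supports the standard kubectl --sort-by flag.

$ oc get pods -o wide --sort-by='.spec.nodeName'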

When a Pod Cannot Be Scheduled

When configuring pod anti-affinity rules, it is possible that no node meets the requirements for scheduling a pod. In this case, the pod enters a Pending state. This section reviews how to identify these pending pods using the OpenShift command-line client oc; if you are unfamiliar with oc, review the OpenShift documentation. When a pod is not schedulable, it shows a Pending status, as in the output below. In our example, we increased the replicas to four on a cluster with three worker nodes, so one of the pods could not be placed.
$ oc get pods
NAME                               READY   STATUS    RESTARTS   AGE
hello-openshift-7c669458bc-bf65x   0/1     Pending   0          4m5s
hello-openshift-7c669458bc-cqllf   1/1     Running   0          4m5s
hello-openshift-7c669458bc-gnbgd   1/1     Running   0          4m5s
hello-openshift-7c669458bc-gv5lh   1/1     Running   0          4m5s
Running oc describe pod <pod-name> provides more information about the pod. The output will include an Events section similar to the one below.
Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  7m44s                default-scheduler  0/6 nodes are available: 3 node(s) didn't match pod anti-affinity rules, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
  Warning  FailedScheduling  6m8s (x1 over 7m8s)  default-scheduler  0/6 nodes are available: 3 node(s) didn't match pod anti-affinity rules, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
The output shows why the pod cannot be scheduled. Notice the result reads 0/6 nodes available: three are controller nodes, which are tainted so regular workloads cannot be scheduled on them, and the other three do not match the pod anti-affinity rules. If we need to view the podAntiAffinity rules in place, we can look at the pod definition.
$ oc get pod hello-openshift-7c669458bc-bf65x -o yaml
The output is long, but you want to look for the affinity section similar to the snippet below.
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - hello-openshift
        topologyKey: kubernetes.io/hostname
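If you only want the affinity rules rather than the whole definition, a JSONPath output filter is a handy alternative; this is a sketch assuming your oc client supports the standard kubectl -o jsonpath option.

$ oc get pod hello-openshift-7c669458bc-bf65x -o jsonpath='{.spec.affinity.podAntiAffinity}'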
In our example, the pods will not be placed on a node where a pod with the app=hello-openshift label is already present. We can view the pod labels as shown below.
$ oc get pods -o wide --show-labels
NAME                               READY   STATUS    RESTARTS   AGE    IP            NODE       NOMINATED NODE   READINESS GATES   LABELS
hello-openshift-7c669458bc-bf65x   0/1     Pending   0          3h2m   <none>        <none>     <none>           <none>            app=hello-openshift,pod-template-hash=7c669458bc
hello-openshift-7c669458bc-cqllf   1/1     Running   0          3h2m   10.120.1.36   worker-2   <none>           <none>            app=hello-openshift,pod-template-hash=7c669458bc
hello-openshift-7c669458bc-gnbgd   1/1     Running   0          3h2m   10.120.1.27   worker-1   <none>           <none>            app=hello-openshift,pod-template-hash=7c669458bc
hello-openshift-7c669458bc-gv5lh   1/1     Running   0          3h2m   10.120.1.93   worker-0   <none>           <none>            app=hello-openshift,pod-template-hash=7c669458bc
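Because anti-affinity matches on labels, it is often useful to list only the pods that carry the label in question; the standard -l selector flag does this.

$ oc get pods -l app=hello-openshift -o wide

If the pending pod is expected because there are more replicas than schedulable nodes, you can either scale the Deployment back down or relax the rule to the preferred form shown earlier.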

Conclusion

Controlling where a pod is placed is a powerful OpenShift capability. Node affinity, pod affinity, and pod anti-affinity are three techniques for doing so, and this article covered the use cases and configuration for pod anti-affinity. The examples here are a good starting point, but requirements vary from cluster to cluster, so use the YAML files from our examples as a baseline and customize the pod anti-affinity rules to meet more advanced needs.