Data Collection for Containers

Container Prerequisites

Containers run an application and its dependencies in resource-isolated processes, packaging the application's code, configuration, and dependencies into portable "building blocks". Container applications can be deployed quickly and reliably in any environment, and they provide control over resources, so you only pay for the resources you use.

Densify can collect and analyze container data and then provide recommendations for optimizing your container-based applications.

Prerequisites

The following configurations are required for Densify container optimization.

  1. Densify account. Contact Densify for details of your subscription or sign up for a free trial. See www.densify.com/service/signup.
  2. Kubernetes or OpenShift
    • cAdvisor, running as part of the kubelet, provides the workload and configuration data required by Densify by default.
  3. Prometheus
  4. kube-state-metrics—This service listens to the Kubernetes API server and generates metrics about the state of objects such as deployments, nodes, and pods.

The following item is optional but provides additional environment information for Densify container optimization.

  1. Node Exporter

Contact your Cloud Advisor for configuration details.
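
The Data Forwarder relies on Prometheus already scraping the kubelet/cAdvisor and kube-state-metrics endpoints. As a reference point only, a minimal Prometheus scrape configuration covering both might look like the sketch below; the job names, the kube-state-metrics service address, and the discovery and relabeling details are assumptions, and most packaged Prometheus deployments already include equivalent jobs.

# Illustrative scrape_configs sketch only; adapt to your Prometheus setup.
scrape_configs:
  # kubelet/cAdvisor: container workload and configuration metrics
  - job_name: kubernetes-cadvisor
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      insecure_skip_verify: true
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - role: node
    relabel_configs:
      - target_label: __metrics_path__
        replacement: /metrics/cadvisor
  # kube-state-metrics: object state (deployments, nodes, pods)
  - job_name: kube-state-metrics
    static_configs:
      - targets: ['kube-state-metrics.kube-system.svc:8080']  # assumed service address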

Configuring Container Data Collection

Data collection configuration consists of installing the Densify Data Forwarder container in your container environment and configuring a connection between the Data Forwarder and Prometheus to collect your container utilization metrics. You also need to configure a connection between the Data Forwarder and your Densify server, so that the collected metrics can be loaded and analyzed. The data collection frequency is configurable through the Data Forwarder container.

Data Forwarder Prerequisite Setup

Running the Data Forwarder

Download the forwarder from GitHub:

https://github.com/densify-dev/Container-Optimization-Data-Forwarder/tree/master/examples

Customize the configuration files. The ConfigMap and CronJob examples in GitHub show how to run and schedule the Data Forwarder container using YAML files:

  • Use the configmap.yml file to generate the config map, which contains the config.properties file.
  • Use the pod.yml file to create the pod.
  • Use the cronjob.yml file to schedule the container to send data to Densify each hour; a sketch of such a CronJob follows this list.
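
The following is a minimal sketch of what such a CronJob could look like, not the authoritative manifest from GitHub; the names, mount path, and ConfigMap reference are assumptions for illustration.

# Hypothetical cronjob.yml sketch: run the Data Forwarder hourly.
# Use batch/v1beta1 on Kubernetes versions prior to 1.21.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: densify-data-forwarder          # assumed name
spec:
  schedule: "0 * * * *"                 # top of every hour, per Densify's recommendation
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: data-forwarder
              image: densify/container-optimization-data-forwarder:latest
              volumeMounts:
                - name: config
                  mountPath: /config    # assumed location of config.properties
          volumes:
            - name: config
              configMap:
                name: densify-config    # assumed ConfigMap name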

Note: If you want to configure the Data Forwarder container to use a specific version of the image, refer to https://hub.docker.com/r/densify/container-optimization-data-forwarder/tags for a list of the available versions. Use the "latest" tag for the latest code in master. If you have issues with the image from the "latest" tag, pull the most recent release from the tags list instead.
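
For example, pinning a specific version in pod.yml might look like the following excerpt; "<release-tag>" is a placeholder for a tag chosen from the Docker Hub tags list.

# Hypothetical pod.yml excerpt: pin a release tag instead of "latest".
spec:
  containers:
    - name: data-forwarder
      image: densify/container-optimization-data-forwarder:<release-tag>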

To run the Densify Data Forwarder container, follow the steps below:

  1. Update the following required properties in the config.properties file with your environment details (an illustrative ConfigMap embedding these properties follows these steps):
    • host—Specify your Densify server host (e.g. myOptimization.densify.com).
    • user—Specify the Densify user that the Data Forwarder will use to authenticate with the Densify server. This user must already exist in Densify and have API access privileges.
    • password/epassword—Specify the password or encrypted password for the Densify user.
    • prometheus_address—Specify the Prometheus URL or IP address that the Data Forwarder uses to connect.
    • prometheus_port—Specify the Prometheus port used for the Data Forwarder connection.
    • zipname—Specify the name you would like to use for the compressed utilization metrics file.

    Depending on your environment, you may need to configure additional parameters in the config.properties file. Contact your Cloud Advisor or support@densify.com for the Densify user and epassword required to connect to your Densify instance.

  2. Run the Data Forwarder container using the updated config.properties file.
  3. Once you have verified that the Data Forwarder container is working, schedule it to run hourly.
  4. Densify recommends collecting data hourly. If this frequency is not appropriate for your environment, contact your Cloud Advisor to discuss your data collection frequency.
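
As a reference, the ConfigMap generated from configmap.yml embeds the config.properties file. The sketch below is illustrative only: take the exact property syntax from the GitHub example, and treat every value shown here as a placeholder.

# Hypothetical configmap.yml sketch; all values are placeholders.
apiVersion: v1
kind: ConfigMap
metadata:
  name: densify-config                  # assumed name, matching the CronJob sketch above
data:
  # key/value syntax per the config.properties example in GitHub
  config.properties: |
    host myOptimization.densify.com
    user <densify user with API access>
    epassword <encrypted password from your Cloud Advisor>
    prometheus_address prometheus.example.svc.cluster.local
    prometheus_port 9090
    zipname mycluster-metrics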

Obtaining the Data Forwarder

If you need to quarantine or manually pull the Data Forwarder image from Docker Hub to a local repository, you can obtain the image from the following location:

https://hub.docker.com/r/densify/container-optimization-data-forwarder

Use the following pull command:

docker pull densify/container-optimization-data-forwarder

You can copy this command directly from the Docker Hub page; see the command example in the upper right corner of the page.

Building and Configuring the Data Forwarder

If you need to build your own version of the Data Forwarder container (e.g., if your security policies do not allow you to pull images from Docker Hub), you can obtain the required code from GitHub:

https://github.com/densify-dev/Container-Optimization-Data-Forwarder

You can customize the code as required to conform to your security standards or reference your proprietary base images and then build your own version of the Data Forwarder container.

Container Concepts

You will need to familiarize yourself with the following terminology:

  • Pod—A group of containers deployed together on the same host. Pods can be stand-alone or controlled by Deployments or a ReplicationController. This data is collected and used by Densify.
  • Request—The amount of resources requested for a container; this acts like a reservation. You can request CPU and memory.
  • Label—Similar to a tag. Set labels to categorize or identify your containers.
  • Limit—The upper cap on the resources that a container is able to use. If the sum of all container resource limits exceeds the total resource capacity of the cluster, there may be resource contention, leading to throttling or evictions.
  • Namespaces—Kubernetes supports multiple virtual clusters backed by the same physical cluster; these virtual clusters are called namespaces.
  • Resource Quota—A resource quota provides constraints that limit aggregate resource consumption per namespace (CPU and memory requests and their limits) and the number of objects that can be created. If the total capacity of the cluster is less than the sum of the namespace quotas, resource contention is handled on a first-come, first-served basis. Setting a quota requires that every incoming container specify explicit resource requests and limits.
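
To make these concepts concrete, the following illustrative manifests show a pod with a label, requests, and limits, together with a resource quota on its namespace; all names and values are examples only.

# Illustrative pod: a label, requests (reservation), and limits (upper cap).
apiVersion: v1
kind: Pod
metadata:
  name: example-app
  namespace: team-a
  labels:
    app: example-app            # label used to categorize/identify the container
spec:
  containers:
    - name: web
      image: nginx:1.25
      resources:
        requests:               # reserved for the container; used for scheduling
          cpu: 250m
          memory: 256Mi
        limits:                 # upper cap the container is allowed to use
          cpu: 500m
          memory: 512Mi
---
# Illustrative quota: constrains aggregate consumption in the team-a namespace.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"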