Managing capacity and overcommitting in Red Hat OpenShift
Managing capacity and overcommitment in Red Hat OpenShift can appear intricate, but grasping a few fundamental concepts simplifies the process considerably. This post covers pod requests and limits, best practices for configuring them, and the role each plays in efficient capacity management and overcommitment.
Pod requests
A pod request is the minimum amount of a compute resource, such as memory or CPU, that you declare your container needs in order to run. For instance, if you set a memory request of 1 Gi, the scheduler ensures at least 1 Gi of memory is available for your pod before placing it on a node.
Capacity management advantage: Guarantees a baseline of resources for every pod, preventing resource shortfalls and ensuring each pod has the minimum it needs to operate efficiently.
Pod limits
A pod limit is the upper bound of resources a pod is allowed to consume. For instance, with a memory limit of 2 Gi, the pod may use up to 2 Gi of memory but no more. The limit is enforced by the kernel through cgroups, preventing any single pod from monopolizing resources and degrading the performance of other pods.
Capacity management advantage: Protects against excessive resource consumption by individual pods, ensuring a fair share of resources for all running pods.
Overcommitment
Overcommitment occurs when the resources a pod may consume exceed what it requested. If a pod has a memory request of 1 Gi and a limit of 2 Gi, it is scheduled based on the 1 Gi request but can use up to 2 Gi. The pod is therefore overcommitted at 200%, because it can access twice the memory it was guaranteed.
Capacity management advantage: Enables more effective use of cluster resources by letting pods draw on extra resources when they are available, without guaranteeing those resources.
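To make these numbers concrete, here is a minimal pod spec (the name and image are illustrative) that requests 1 Gi of memory but may burst up to the 2 Gi limit described above:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: overcommit-demo                       # illustrative name
spec:
  containers:
  - name: app
    image: registry.example.com/app:latest    # placeholder image
    resources:
      requests:
        cpu: "500m"      # guaranteed half a core at scheduling time
        memory: "1Gi"    # the scheduler reserves this much on the node
      limits:
        memory: "2Gi"    # cgroup-enforced ceiling; exceeding it triggers an OOM kill
```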
Why requests and limits matter
Requests and limits are crucial to keeping your Red Hat OpenShift cluster running efficiently and predictably. Failing to set them can lead to negative outcomes.
Resources are not guaranteed
Without resource requests, the scheduler cannot guarantee a specific amount of CPU or memory for your pods. This can result in degraded performance or even pod failures when the node comes under heavy load.
Resource usage is unbounded
A container running without limits can consume unlimited CPU and memory. This can lead to resource starvation, where a single container consumes all available resources and causes other containers to fail or be evicted.
Capacity management advantage: Setting both requests and limits promotes fair resource distribution, avoiding under-provisioning (resource starvation) as well as over-provisioning (resource hogging).
Best practices for setting requests and limits
Follow these five fundamental best practices when setting requests and limits:
- Always set memory and CPU requests.
- Avoid setting CPU limits, which can lead to throttling.
- Monitor your workload, and base requests on the average utilization observed over time.
- Set memory limits to a multiple (scaling factor) of the memory request.
- Use the Vertical Pod Autoscaler (VPA) to refine and adjust these values over time.
Capacity management advantage: These practices ensure every pod gets the resources it needs without over-allocation, leading to optimal resource utilization and better cluster performance.
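A minimal sketch of container resources that follow these practices (the values are illustrative, derived from hypothetical observed usage):

```yaml
# Container resources following the practices above: requests for both CPU
# and memory, no CPU limit (to avoid throttling), and a memory limit set to
# 2x the request (an assumed scaling factor).
resources:
  requests:
    cpu: "250m"        # based on observed average CPU usage (illustrative)
    memory: "512Mi"    # based on observed average memory usage (illustrative)
  limits:
    memory: "1Gi"      # 2x the memory request; note that no CPU limit is set
```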
Using the Vertical Pod Autoscaler (VPA) for efficient resource allocation
The Vertical Pod Autoscaler (VPA) feature in Red Hat OpenShift adjusts a pod's CPU and memory allocations when it needs more resources. When using VPA, keep the following recommendations in mind:
- Install and configure VPA in Recommendation mode only.
- Run realistic load simulations against your pods.
- Review the recommended values and adjust pod resources as needed.
Why use Recommendation mode only?
When VPA is set to Automatic mode, pods are restarted to apply the recommended values. The in-place VPA feature, which avoids restarts, is still in alpha as of Red Hat OpenShift 4.16.
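A minimal VPA object in Recommendation mode might look like this (the target workload name is an assumption for illustration):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app            # hypothetical workload to observe
  updatePolicy:
    updateMode: "Off"    # recommendation mode: compute suggestions, never restart pods
```

The computed recommendations appear in the object's status, which you can inspect with `oc describe vpa app-vpa`.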
Changing the watch time for recommenders
VPA supports custom recommenders, letting you configure watch durations of 1 day, 1 week, or 1 month according to your requirements. For more information, see Automatically adjusting pod resource levels with the vertical pod autoscaler.
Capacity management advantage: VPA dynamically adjusts resource requests and limits based on actual usage patterns, ensuring optimal resource distribution and reducing the risk of overcommitment.
System-reserved resources in Red Hat OpenShift
Red Hat OpenShift reserves a portion of each node's resources, such as CPU and memory, for essential system processes like the kubelet and the container runtime. Marking these resources as system-reserved offers several advantages:
- Dedicates resources to system processes, avoiding contention with application workloads.
- Improves node stability and performance by ensuring critical system services are never starved of resources.
- Keeps the cluster functioning consistently and performing reliably.
You can enable automatic node resource allocation on self-managed OpenShift clusters by following the OpenShift documentation. On a managed OpenShift offering such as Red Hat OpenShift Service on AWS (ROSA), this is handled for you.
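On self-managed clusters, this is done with a KubeletConfig that enables automatic sizing; a sketch, assuming the worker machine config pool is the target:

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: dynamic-node-resources
spec:
  autoSizingReserved: true    # let OpenShift compute system-reserved CPU/memory per node
  machineConfigPoolSelector:
    matchLabels:
      pools.operator.machineconfiguration.openshift.io/worker: ""   # target worker nodes
```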
Capacity management advantage: Reserving resources for system processes keeps critical services running smoothly and reduces the risk of application performance disruptions caused by competition for system-level resources.
Cluster autoscaler
The cluster autoscaler automatically adds or removes nodes as needed, and it works in tandem with the Horizontal Pod Autoscaler (HPA). For more information, see the OpenShift Cluster Autoscaler guide and the OpenShift documentation on autoscaling.
Capacity management advantage: The cluster autoscaler keeps the cluster at an appropriate node count for the current workload, scaling up or down automatically. This helps achieve optimal resource utilization and better cost efficiency.
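A minimal sketch of a ClusterAutoscaler resource (the limits shown are illustrative; node pools also need matching MachineAutoscaler resources that define per-pool minimum and maximum replicas):

```yaml
apiVersion: autoscaling.openshift.io/v1
kind: ClusterAutoscaler
metadata:
  name: default           # the cluster-wide autoscaler object is named "default"
spec:
  resourceLimits:
    maxNodesTotal: 24     # illustrative ceiling on total cluster size
  scaleDown:
    enabled: true         # remove nodes that are no longer needed
    delayAfterAdd: 10m    # wait after a scale-up before considering scale-down
    unneededTime: 5m      # how long a node must be unneeded before removal
```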
ClusterResourceOverride operator (CRO)
The ClusterResourceOverride operator lets administrators override container resource requests and limits cluster-wide, promoting efficient and equitable utilization across the cluster.
Sample configuration:
- CPU request: 100 millicores (0.1 cores)
- Memory request: 200 MiB
- CPU limit: 200 millicores (0.2 cores)
- Memory limit: 400 MiB
Overrides:
- CPU request override: 50% of the original request
- Memory request override: 75% of the original request
- CPU limit override: 2x the overridden request
- Memory limit override: 2x the overridden request
Resulting resources:
- CPU request: 100 millicores x 50% = 50 millicores (0.05 cores)
- Memory request: 200 MiB x 75% = 150 MiB
- CPU limit: 50 millicores x 2 = 100 millicores (0.1 cores)
- Memory limit: 150 MiB x 2 = 300 MiB
For more information, see Cluster-level overcommit using the Cluster Resource Override Operator.
Capacity management advantage: By adjusting default resource requests and limits, you can ensure resources are allocated effectively, avoiding both underutilization and overcommitment.
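The operator is configured through a cluster-scoped ClusterResourceOverride custom resource. Note that its fields express requests as a percentage of the configured limits rather than of the original requests, so the values below are illustrative rather than a direct translation of the worked example above:

```yaml
apiVersion: operator.autoscaling.openshift.io/v1
kind: ClusterResourceOverride
metadata:
  name: cluster                        # the override CR must be named "cluster"
spec:
  podResourceOverride:
    spec:
      memoryRequestToLimitPercent: 50  # memory request becomes 50% of the memory limit
      cpuRequestToLimitPercent: 25     # CPU request becomes 25% of the CPU limit
      limitCPUToMemoryPercent: 200     # derive the CPU limit from the memory limit
```

The overrides apply only in namespaces that opt in via the clusterresourceoverrides.admission.autoscaling.openshift.io/enabled: "true" label.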
Scalability envelope
The scalability envelope can be pictured as a multi-dimensional cube. Staying inside the envelope keeps your performance service-level objectives (SLOs) achievable and your Red Hat OpenShift cluster operating effectively; as you push along one dimension, the remaining capacity in the other dimensions shrinks. The OpenShift dashboard helps you observe your green zone, a safe area for scaling your cluster objects, and your red zone, the boundary beyond which you should not scale.
Capacity management advantage: Understanding and operating within the scalability envelope keeps your cluster reliable under varying workloads, avoiding resource bottlenecks and maintaining consistent performance.
Pod autoscaling
There are multiple ways to autoscale pods in Red Hat OpenShift. We covered the Vertical Pod Autoscaler (VPA) above; additional strategies follow.
Horizontal Pod Autoscaler (HPA)
HPA scales pods by adjusting the number of replicas. It is particularly useful for stateless applications in production, improving performance and availability by absorbing load and preventing out-of-memory (OOM) kills. For more information, see Automatically scaling pods with the horizontal pod autoscaler.
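A minimal HPA targeting average CPU utilization (the workload name and thresholds are illustrative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app                    # hypothetical stateless workload
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # add replicas when average CPU exceeds 70% of requests
```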
Custom Metrics Autoscaler
The Custom Metrics Autoscaler scales pods based on user-defined metrics, making it suitable for production, testing, and development environments alike. It improves application availability and performance by watching specific stress signals and scaling in response. For more information, see the Custom Metrics Autoscaler Operator overview.
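The Custom Metrics Autoscaler is based on KEDA and is configured with ScaledObject resources. A sketch scaling on a Prometheus query (the server address, query, and threshold are assumptions; in-cluster setups typically also need a TriggerAuthentication, omitted here):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: app-scaledobject
spec:
  scaleTargetRef:
    name: app                    # hypothetical Deployment to scale
  minReplicaCount: 1
  maxReplicaCount: 20
  triggers:
  - type: prometheus
    metadata:
      serverAddress: https://thanos-querier.openshift-monitoring.svc.cluster.local:9092  # assumed endpoint
      query: sum(rate(http_requests_total{job="app"}[2m]))   # illustrative stress metric
      threshold: "100"           # scale out when the query result exceeds 100 per replica
```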
Capacity management advantage: Autoscaling in response to workload demand ensures your applications have the resources they need to handle fluctuating loads, improving both performance and resource utilization.
OpenShift scheduler
The OpenShift scheduler's LowNodeUtilization profile spreads pods evenly across nodes, keeping resource consumption on each node low. The advantages include:
- Cost efficiency in cloud environments by reducing the number of nodes required.
- Better resource distribution across the cluster.
- Energy efficiency in data centers.
- Improved efficiency by avoiding node overload.
- Reduced resource exhaustion through even workload distribution.
For more details, see Scheduling pods using a scheduler profile.
Capacity management advantage: The scheduler balances resource allocation, preventing both hotspots and underutilized nodes and producing a more balanced, efficient cluster.
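Scheduler profiles are set on the cluster-scoped Scheduler configuration; a minimal sketch:

```yaml
apiVersion: config.openshift.io/v1
kind: Scheduler
metadata:
  name: cluster                 # the cluster scheduler configuration is named "cluster"
spec:
  profile: LowNodeUtilization   # spread pods to keep per-node utilization low
```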
OpenShift descheduler
The AffinityAndTaints profile evicts pods that violate inter-pod anti-affinity, node affinity, or node taints. The advantages include:
- Correcting poor pod placement.
- Enforcing node affinities and anti-affinities.
- Adapting to changes in node taints so that only suitable pods remain on a node.
For more information, see Evicting pods using the descheduler.
Capacity management advantage: The descheduler maintains optimal pod placement over time, adapting to changes in the cluster and keeping resource usage efficient while honoring affinity and taint constraints.
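The descheduler is configured through the KubeDescheduler custom resource; a minimal sketch with the AffinityAndTaints profile (the interval is illustrative):

```yaml
apiVersion: operator.openshift.io/v1
kind: KubeDescheduler
metadata:
  name: cluster                        # the descheduler CR must be named "cluster"
  namespace: openshift-kube-descheduler-operator
spec:
  deschedulingIntervalSeconds: 3600    # evaluate pods for eviction every hour
  profiles:
  - AffinityAndTaints                  # evict pods violating affinity rules or taints
```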
By following these best practices and using the tools OpenShift provides, you can effectively manage capacity and overcommitment, keeping your applications running smoothly and efficiently.
Red Hat Advanced Cluster Management right sizing
Red Hat Advanced Cluster Management for Kubernetes (RHACM) right sizing has been released as an enhanced developer preview. RHACM right sizing gives platform engineering teams namespace-level recommendations based on CPU and memory usage. The feature is currently backed by Prometheus recording rules, applying maximum and peak-value logic over various aggregation periods, including 1, 2, 5, 10, 30, 60, and 90 days.
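To illustrate the idea of peak-over-a-window aggregation at the namespace level, here is a hand-written Prometheus recording rule; the rule name and expression are purely illustrative assumptions, not the rules RHACM actually ships:

```yaml
groups:
- name: rightsizing-illustration
  rules:
  - record: namespace:cpu_usage:max_over_time_1d   # hypothetical rule name
    # Peak of the 5m-averaged CPU usage per namespace over the last day,
    # computed with a PromQL subquery (illustrative only).
    expr: |
      max_over_time(
        sum by (namespace) (rate(container_cpu_usage_seconds_total[5m]))[1d:5m]
      )
```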
The advantages of RHACM right sizing include:
- Identify the biggest resource offenders, such as areas contributing to underutilization.
- Promote transparency across your organization and start the right conversations.
- Improve fleet management with RHACM, enabling cost efficiency and resource optimization no matter how many managed clusters you deploy.
- Navigate easily via a dedicated Grafana dashboard integrated into the RHACM console.
Capacity management advantage: RHACM right sizing gives platform engineers CPU and memory right-sizing recommendations in a dedicated Grafana dashboard, integrated into the Red Hat Advanced Cluster Management for Kubernetes console. Because utilization fluctuates over time, recommendations are available across different aggregation periods, including longer time frames, making them easy to understand and act on (see the figures below).