By Huaixin Chang from Alibaba Cloud, core member of Cloud Kernel SIG in the OpenAnolis community.

The annoying CPU throttling affects container operation. Sometimes, people have to sacrifice container deployment density to avoid CPU throttling. The CPU Burst technology we designed can guarantee the service quality of containers without reducing the density of container deployment.

This article is divided into three parts. The second part will analyze the size effects of CPU Burst. The third part will evaluate the impact of CPU Burst and discuss how to configure it to achieve the best results. The CPU Burst feature has been incorporated into Linux 5.14. Anolis OS 8.2, Alibaba Cloud Linux 2, and Alibaba Cloud Linux 3 also support the CPU Burst feature.

In Kubernetes container scheduling, the upper limit of a container's CPU resources is specified by the CPU limits parameter. By setting this upper limit, we can cap the CPU run time consumed by some containers and ensure that other containers get enough CPU resources. In the Linux kernel, CPU limits are implemented with CPU Bandwidth Controller, which limits the resource consumption of cgroups through CPU throttling. Therefore, when the processes in a container use more CPU than the CPU limits allow, these processes are throttled. The CPU time they can use is limited, and some key latency indicators of these processes deteriorate.

What should we do in this situation? Usually, we multiply the daily peak CPU utilization of the container by a safety factor to set the CPU limits of the container. This way, we can avoid the deterioration of service quality caused by throttling while still keeping CPU resources reasonably utilized. If we have a container whose daily peak CPU utilization is around 250%, we set the container CPU limits to 400% to ensure the container service quality. As such, the CPU utilization of the container is 62.5% (250%/400%).

However, is it perfect? It is not! CPU throttling occurs much more frequently than expected. What should we do? It seems we can only keep increasing CPU limits to solve it. In many cases, the service quality of the container is better guaranteed only after the CPU limits are enlarged 5 to 10 times. However, the total CPU utilization of the container is then only 10% to 20%. Therefore, the deployment density of containers must be reduced to cope with possible container CPU usage peaks.

In the past, people fixed some CPU throttling problems caused by bugs in CPU Bandwidth Controller. The current unexpected throttling is caused by bursts of CPU use at the 100ms level. We proposed CPU Burst technology to allow a certain amount of burst CPU use, so that throttling is avoided when the average CPU utilization is lower than the CPU limit. In cloud computing scenarios, the value of CPU Burst is twofold: it improves the service quality of CPU resources without increasing the CPU configuration, and it allows resource owners to reduce their CPU resource configuration and improve CPU resource utilization without sacrificing service quality.

The CPU Utilization You See Is Not the Whole Truth

The second-level CPU utilization does not reflect the CPU usage at the 100ms level that Bandwidth Controller works on. This is the cause of unexpected CPU throttling.

Bandwidth Controller applies to CFS tasks. It uses a period and a quota to manage the CPU time consumption of a cgroup. If the cgroup period is 100ms and its quota is 50ms, the processes in the cgroup can use at most 50ms of CPU time in every 100ms. When the CPU usage in a 100ms period exceeds 50ms, the processes are throttled, and the cgroup CPU usage is limited to 50%.

CPU utilization is the average CPU usage over a period of time. When CPU usage is recorded at a coarse granularity, CPU utilization tends to look stable. When the granularity gets finer, the bursty nature of CPU usage becomes more obvious. If we observe the load of a running container at 1s and 100ms granularity at the same time, the second-level CPU utilization averages about 250%, while the peak CPU utilization observed at the 100ms level that Bandwidth Controller works on exceeds 400%.

Suppose we set the container quota and period to 400ms and 100ms, based on the CPU utilization of 250% observed at one-second granularity.
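The gap between second-level utilization and the 100ms periods that Bandwidth Controller enforces can be illustrated with a small simulation. This is only a minimal sketch, not the kernel's algorithm: the function name `simulate_throttling`, the demand trace, and the rule that unfinished work carries over to the next period are all assumptions made for illustration. Only the 400ms quota, the 100ms period, and the roughly 250% average come from the example in this article.

```python
def simulate_throttling(demand_ms, quota_ms=400, period_ms=100):
    """Toy model of quota/period enforcement (hypothetical helper, not a kernel API).

    demand_ms: CPU time (in ms, summed over all CPUs) the container wants
               in each consecutive period.
    Returns (average utilization %, peak per-period utilization %,
             number of periods in which the container was throttled).
    """
    backlog = 0      # work deferred from earlier throttled periods (assumption)
    throttled = 0
    for demand in demand_ms:
        wanted = demand + backlog
        used = min(wanted, quota_ms)   # the quota caps CPU time per period
        if wanted > quota_ms:
            throttled += 1             # quota exhausted before the period ended
        backlog = wanted - used
    avg_util = 100 * sum(demand_ms) / (len(demand_ms) * period_ms)
    peak_util = 100 * max(demand_ms) / period_ms
    return avg_util, peak_util, throttled


# Made-up trace: ten 100ms periods (one second) that average 250% utilization.
trace = [150, 150, 550, 200, 150, 450, 200, 150, 300, 200]
avg, peak, throttled = simulate_throttling(trace)
print(f"1s average: {avg:.0f}%, 100ms peak: {peak:.0f}%, throttled periods: {throttled}")
```

With this made-up trace, the one-second average is 250%, well under the 400% limit, yet two of the ten periods are throttled because their demand exceeds the 400ms quota.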