What can be the cause of this issue?
A very good explanation of ESXi CPU scheduling can be found here: https://www.youtube.com/watch?v=8jeBIvzyB80
It explains how the ESXi hypervisor schedules the virtual CPUs (vCPUs) of the virtual machines onto the physical CPUs. And this is where the issue arises. For example:
[Picture from the YouTube video "The vSphere CPU Scheduler" by "TrainerTests"]
This can be observed by watching the following CPU metrics:
Which ESXi metrics should be monitored?
· %USED tells you how much time the virtual machine spent executing CPU cycles on the physical CPU.
· %RDY (should be very low) is a very important performance indicator; start with this one. It shows how much time your virtual machine wanted to execute CPU cycles but could not get access to a physical CPU, i.e. how much time it spent in a “queue”. Expect this value to stay below 5% (this equals 1000 ms in the vCenter real-time performance graphs, which use a 20-second sample interval).
· %CSTP (should be 0.0%) tells you how much time a vCPU of a virtual machine with multiple vCPUs was stopped while waiting for the other vCPUs of the same virtual machine to catch up. If this number is higher than 1%, you should consider lowering the number of vCPUs in your virtual machine.
· %WAIT is the percentage of time the virtual machine was waiting for some VMkernel activity to complete (such as I/O) before it could continue.
· %IDLE (should be high) is the percentage of time a world spent in the idle loop.
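The relation between the 5% figure and the 1000 ms shown in the vCenter graphs can be sketched as a small conversion. This is only an illustration: the function name `ready_ms_to_percent` is made up for this example, the 20-second default assumes the real-time chart interval (other chart views use longer intervals), and dividing by the vCPU count to get a per-vCPU average is an assumption about how the VM-level value is aggregated.

```python
def ready_ms_to_percent(ready_ms: float, interval_s: float = 20.0, vcpus: int = 1) -> float:
    """Convert a CPU ready summation value (ms) into a percentage.

    ready_ms   -- ready time in milliseconds as shown in the performance graph
    interval_s -- chart sample interval in seconds (assumed: 20 s for real-time)
    vcpus      -- number of vCPUs the value is summed over (assumed aggregation)
    """
    if interval_s <= 0 or vcpus < 1:
        raise ValueError("interval_s must be positive and vcpus at least 1")
    # Share of the sample interval that was spent waiting for a physical CPU.
    return ready_ms * 100.0 / (interval_s * 1000.0 * vcpus)


if __name__ == "__main__":
    # 1000 ms of ready time within a 20 s sample corresponds to 5 %.
    print(ready_ms_to_percent(1000))
```

With a different chart interval, pass the matching `interval_s` value, otherwise the resulting percentage will be misleading.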
👉 High %CSTP indicates that ESXi is asking the vCPUs to wait (a.k.a. co-stopping them for scheduling purposes) -> decrease the number of vCPUs of the VM
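The rules of thumb from this section (%RDY below 5%, %CSTP not above 1%) can be written down as a small check. This is a minimal sketch and not a VMware tool: the function `check_vm` and the two constants are hypothetical names, and the values are assumed to be typed in by hand after reading them from esxtop or the vCenter graphs.

```python
from typing import List

# Thresholds taken from the rules of thumb in this article.
RDY_LIMIT_PERCENT = 5.0   # %RDY should stay below this value
CSTP_LIMIT_PERCENT = 1.0  # %CSTP above this value suggests too many vCPUs


def check_vm(rdy_percent: float, cstp_percent: float) -> List[str]:
    """Return a list of findings for one VM; an empty list means no finding."""
    findings = []
    if rdy_percent >= RDY_LIMIT_PERCENT:
        findings.append(
            f"%RDY is {rdy_percent:.1f}%: the VM waits too long for a physical CPU"
        )
    if cstp_percent > CSTP_LIMIT_PERCENT:
        findings.append(
            f"%CSTP is {cstp_percent:.1f}%: consider reducing the number of vCPUs"
        )
    return findings


if __name__ == "__main__":
    # Example values, chosen only for illustration.
    for finding in check_vm(rdy_percent=7.5, cstp_percent=3.2):
        print(finding)
```

The thresholds are kept as named constants so they can be adjusted if your environment uses different limits.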