uptime command shows high cpu load while atop shows cpu idle
Hi,
I am working on an embedded device running a Debian-based distribution. I'm monitoring the CPU load average using the uptime command and atop. Sometimes I see the uptime 1-minute average go up a lot (up to 4 on a single-processor board, A LOT!), but in atop (with interval 1) I don't see the huge CPU usage that the uptime command suggests. I see something like:
CPU sys 7% user 4% irq 1% idle 88%
AFAIK, the percentages shown by atop correspond to:
sys: processes in kernel space
user: processes in user space
irq: interrupt handling
idle: doing nothing
So the CPU is 88% idle, but uptime suggests a huge load...
Are there any processes in the kernel that are not shown in atop? How can I trace them, or how can I figure out where the bottleneck is if atop doesn't tell me which process is the buggy one? I also tried the vmstat command, but it didn't give me any clue.
|
I've seen high load with low CPU usage in top before; in my case it's usually caused by I/O. If the drive starts lagging or pausing, CPU usage drops because the processes are blocked waiting for the I/O, but the load does not drop with it. So if you have four processes waiting on a hung or slow drive, CPU usage will be low while they wait, but load will be high, since all four processes are waiting on system resources.
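One way to check this on the device is to put the load average next to the kernel's own counters of runnable and I/O-blocked tasks. The following is only a rough Python sketch, assuming Python is available on the board and that /proc/loadavg and /proc/stat have the usual layout described in proc(5). The function names parse_loadavg and parse_proc_counters are made up for this example.

```python
from pathlib import Path


def parse_loadavg(text):
    """Parse the single line of /proc/loadavg into a dict."""
    # Layout: load1 load5 load15 runnable/total last_pid
    one, five, fifteen, entities, last_pid = text.split()
    runnable, total = entities.split("/")
    return {
        "load1": float(one),
        "load5": float(five),
        "load15": float(fifteen),
        "runnable": int(runnable),
        "total": int(total),
        "last_pid": int(last_pid),
    }


def parse_proc_counters(text):
    """Pull the procs_running and procs_blocked lines out of /proc/stat."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] in ("procs_running", "procs_blocked"):
            counters[fields[0]] = int(fields[1])
    return counters


# Only meaningful on Linux, where /proc exists.
if __name__ == "__main__" and Path("/proc/loadavg").exists():
    load = parse_loadavg(Path("/proc/loadavg").read_text())
    counters = parse_proc_counters(Path("/proc/stat").read_text())
    print("1-minute load average:", load["load1"])
    print("currently runnable:   ", counters.get("procs_running"))
    print("blocked on I/O:       ", counters.get("procs_blocked"))
```

If the load average is high while procs_running stays near 1, the load is probably coming from waiting tasks rather than from CPU work. Note that procs_blocked only covers tasks waiting for I/O to complete, so it may still read 0 when tasks are stuck in some other uninterruptible wait.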
From man uptime: Quote:
|
Thanks for the reply, suicidaleggroll.
When a process is waiting for I/O, the kernel preempts it and gives the CPU to another process, right? So if a process is waiting, another one should get the CPU for execution. Maybe I have many processes waiting for I/O, but then I don't know why I don't see waits (the wa column) or io in the vmstat output. I see mainly 0s in those columns. Could it be that the RAM is buggy and the waits are there? Thanks!
|
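To double-check the wa figure independently of vmstat, the same number can be derived from two samples of the aggregate cpu line in /proc/stat. This is a minimal sketch, assuming the field order documented in proc(5) (user, nice, system, idle, iowait, irq, softirq, steal). The helper names read_cpu_ticks and cpu_shares are invented for this example. As far as I know, iowait is only charged for tasks blocked on I/O, so tasks in uninterruptible sleep for other reasons (a lock, a driver) can raise the load average while wa stays at 0.

```python
import time
from pathlib import Path

# Field order of the "cpu" line in /proc/stat; older kernels expose fewer fields.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def read_cpu_ticks(stat_text):
    """Return the aggregate 'cpu' line of /proc/stat as a dict of tick counts."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def cpu_shares(before, after):
    """Percentage of elapsed ticks spent in each state between two samples."""
    deltas = {name: after.get(name, 0) - before.get(name, 0) for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        return {name: 0.0 for name in FIELDS}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


# Only meaningful on Linux, where /proc exists.
if __name__ == "__main__" and Path("/proc/stat").exists():
    first = read_cpu_ticks(Path("/proc/stat").read_text())
    time.sleep(1)  # same 1-second interval as used with atop in this thread
    second = read_cpu_ticks(Path("/proc/stat").read_text())
    for name, share in cpu_shares(first, second).items():
        print(f"{name:8s} {share:5.1f}%")
```

If iowait stays near 0 here as well while the load average is high, the waiting tasks are probably not blocked on disk I/O, and it is worth looking at which tasks are in uninterruptible sleep instead.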
Are you using multi-tasking? Are any of these processes waiting on queues, or waiting for locks that another process may have set?
|
Unfortunately, that quote from the manpage is misleading at best.
Quote:
Quote:
s/is waiting/may be waiting/
This is a major difference, especially in Linux, where loadavg !== CPU load. Processes can be put into an uninterruptible state and left there for some (hopefully short) time. Two of these add 2 to the loadavg but have NO effect on CPU%. Apache is known to do this, likewise Oracle. Then, of course, you also have tasks waiting on I/O completion, which also contribute to loadavg as mentioned. So you have to understand the workload, and which processes actually are in uninterruptible wait.
|
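To see which tasks are actually in uninterruptible wait, one option is to scan the state field of every thread under /proc. The following is a rough Python sketch, assuming the /proc/<pid>/task/<tid>/stat layout from proc(5), where state D means uninterruptible sleep. The names parse_stat and tasks_in_state are hypothetical, chosen for this example.

```python
from pathlib import Path


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of a /proc/<pid>/stat file."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    pid = int(stat_text[:open_paren])
    comm = stat_text[open_paren + 1:close_paren]
    state = stat_text[close_paren + 1:].split()[0]
    return pid, comm, state


def tasks_in_state(states="D", proc_root="/proc"):
    """Yield (tid, comm, state) for every thread whose state letter is in `states`."""
    for proc_dir in Path(proc_root).iterdir():
        if not proc_dir.name.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            task_dirs = list((proc_dir / "task").iterdir())
        except OSError:
            continue  # process exited while we were scanning
        for task_dir in task_dirs:
            try:
                tid, comm, state = parse_stat((task_dir / "stat").read_text())
            except (OSError, ValueError, IndexError):
                continue  # thread exited, or the stat line was unreadable
            if state in states:
                yield tid, comm, state


# Only meaningful on Linux, where /proc exists.
if __name__ == "__main__" and Path("/proc/self/stat").exists():
    for tid, comm, state in tasks_in_state("D"):
        print(tid, state, comm)
```

A single scan is only a snapshot, so on a system where the waits are short it usually has to be run repeatedly while the load average is climbing. Passing "RD" instead of "D" lists runnable tasks as well, which are the two states that count toward the load average.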