Occasional, unexplainable load average rocketing

mk27 · 06-13-2011, 10:21 AM

I've been a Linux user for more than a decade. My load average is almost always low (<0.2). However, lately I have been having a problem with the average suddenly soaring to 6-8+, which completely locks the system.

This lasts a minute or two, and happens once or twice a day. I cannot find any explanation for it -- in top, there is no evidence of sudden spawning, etc. I.e., this is not due to a real increase in the number of processes.

That implies to me that something strange is going on with the kernel run queue, but that is only a slightly educated guess.

Anyone know how I can track this down?

amani · 06-13-2011, 11:58 AM

Check and maintain your logs.

SELinux is known to cause load increases for some applications in recent Fedora releases.

mk27 · 06-13-2011, 12:27 PM

There's nothing in the logs at all. Mebbe I'll up the kernel log level to debug.

Valery Reznic · 06-13-2011, 11:41 PM

Quote:

Originally Posted by mk27

There's nothing in the logs at all. Mebbe I'll up the kernel log level to debug.

Maybe you have something in crontab?

ssrameez · 06-14-2011, 12:41 AM

I would suggest setting up a small script that takes a snapshot of the system every 2 minutes.
The script can capture the output of:
vmstat
top
ps -eaf
Send this to log files with a timestamp.

Analyze the files after two or three days to find the culprit process.

mk27 · 06-14-2011, 05:47 AM

Quote:

Originally Posted by ssrameez

I would suggest setting up a small script that takes a snapshot of the system every 2 minutes.

I have had top running before when it's happening, and there are no clues there.

I can't use a script, because that will involve new processes. One of the symptoms is that the active process is frozen for the duration (whatever window I'm working in when it starts). Other windows/existing processes are usable (eg, the browser), but you cannot start another process (eg, a terminal "works", but any command you issue must wait until the event is over). So a single process that can do the necessary monitoring might work, but writing that is not a minor task.
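For what it's worth, the single-process monitor described above need not be a big job if it only reads /proc. The following is a minimal sketch, not a tested tool: the names parse_stat, blocked_tasks and monitor are made up for illustration, and it assumes the log file lives somewhere that does not touch the failing disk (a tmpfs mount, say), since a write to the bad disk would stall the monitor as well. Once started it spawns nothing; every few seconds it reads /proc/loadavg and lists the tasks whose state field in /proc/<pid>/stat is D.

```python
import os
import time


def parse_stat(stat_line):
    """Return (pid, command name, state) from one /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces
    # or parentheses, so split on the LAST ')' rather than on whitespace.
    head, _, tail = stat_line.rpartition(")")
    pid, _, comm = head.partition("(")
    return int(pid), comm, tail.split()[0]


def blocked_tasks(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except OSError:
            continue  # the process exited between listdir() and open()
        if state == "D":
            blocked.append((pid, comm))
    return blocked


def monitor(logfile, interval=5.0):
    """Append one timestamped line to logfile every `interval` seconds.

    Assumption: logfile is NOT on the suspect disk, otherwise this
    process would block on its own writes during the event.
    """
    with open(logfile, "a", buffering=1) as log:  # line-buffered
        while True:
            with open("/proc/loadavg") as f:
                load = f.read().strip()
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log.write(f"{stamp} loadavg={load} D={blocked_tasks()}\n")
            time.sleep(interval)
```

It has to be started before the event, for example with monitor("/dev/shm/loadmon.log") (that path is only an example of a RAM-backed location). It walks processes only, not individual threads, which is enough to see who is stuck.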

However, changing the configuration of rlogd got me the appropriate kernel output in /var/messages:

Code:

Jun 14 06:28:34 kernel: [ 3479.708078] ata4: lost interrupt (Status 0x50)
Jun 14 06:28:34 kernel: [ 3479.708100] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Jun 14 06:28:34 kernel: [ 3479.708108] ata4.00: failed command: WRITE DMA
Jun 14 06:28:34 kernel: [ 3479.708118] ata4.00: cmd ca/00:88:4f:02:ec/00:00:00:00:00/ec tag 0 dma 69632 out
Jun 14 06:28:34 kernel: [ 3479.708120]          res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jun 14 06:28:34 kernel: [ 3479.708125] ata4.00: status: { DRDY }
Jun 14 06:28:34 kernel: [ 3479.708139] ata4: soft resetting link
Jun 14 06:28:34 kernel: [ 3479.868631] ata4.00: configured for UDMA/133
Jun 14 06:28:34 kernel: [ 3479.868641] ata4.00: device reported invalid CHS sector 0
Jun 14 06:28:34 mint kernel: [ 3479.868656] ata4: EH complete

This pattern repeats every ~30s for 5 minutes, during which the lock-up is constant. I've had the same event cause boot failures and application crashes lately; it's a hard drive failure. I can't see the kernel itself depending on disk writes, but obviously something bad potentially happens when an active process gets caught in this.
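To check the ~30s spacing rather than eyeball it, the kernel uptime stamps in square brackets can be pulled out of the log. A small sketch, assuming every line has the "kernel: [ seconds] message" shape shown above; error_times, gaps and the default marker string are illustrative choices, not anything the kernel defines.

```python
import re

# Matches the "[ 3479.708078]" uptime stamp and the message that follows it.
LINE_RE = re.compile(r"kernel: \[\s*(\d+\.\d+)\]\s+(.*)")


def error_times(lines, marker="exception Emask"):
    """Return the kernel uptime (in seconds) of each line containing marker."""
    times = []
    for line in lines:
        m = LINE_RE.search(line)
        if m and marker in m.group(2):
            times.append(float(m.group(1)))
    return times


def gaps(times):
    """Seconds elapsed between consecutive events."""
    return [round(b - a, 3) for a, b in zip(times, times[1:])]
```

Feeding the open log file to error_times() and the result to gaps() should give a list of values near 30 if the retries really are evenly spaced, which would point at a fixed timeout in the error handling rather than at random stalls.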

Hopefully it's just some bad blocks...

syg00 · 06-14-2011, 07:36 AM

Quote:

Originally Posted by mk27

I can't see the kernel itself depending on disk writes, but obviously something bad potentially happens when an active process gets caught in this.

Oh yeah? Where is the kernel loaded from - and where is /var/messages?

As for your loadavg problem, that is almost a classic disk error situation. But the loadavg is a symptom, not the problem itself. Get a new disk.

mk27 · 06-14-2011, 08:44 AM

Quote:

Originally Posted by syg00

Oh yeah? Where is the kernel loaded from

A compressed image on disk, but once it is loaded, it's all in RAM.

Quote:

- and where is /var/messages?

1) Until I reconfigured rlogd there was no file logging at all during the event.

2) The kernel does not log to disk anyway; it uses printk to the console, and this is captured by a userspace tool (rlogd, or whatever). AFAIK the kernel does not do any disk I/O at all except for swap (my swap is not active) and on behalf of userland, which is not critical to its functioning (userland needs the kernel, the kernel does not need userland).

3) The logging of the error presumes the error has occurred, so logging the error cannot be the cause of the error.

My point is, if this failure happens because of a disk write by a userspace application (which seems to be the case) it should not, IMO, cause craziness with the kernel run queue.

Quote:

As for your loadavg problem, that is almost a classic disk error situation.

Thanks for confirming that. Still curious as to why/how a disk error would lead to a loadavg problem, tho. The fact that it reports exactly 10 times 30 seconds apart implies to me there is some intentional error handling that compensates for the issue in the end.

syg00 · 06-14-2011, 08:54 AM

It has nothing to do with the run queue - that's Unix thinking, not Linux.
Linux loadavg comprises runnable tasks plus those in uninterruptible sleep. The latter are usually (but not exclusively) tasks waiting on disk I/O.
Usually that doesn't matter one iota to other tasks. But if kernel threads get hung up (and they can - even kswapd and the bdi's) then you are history until the outstanding I/O clears. If it doesn't clear, goodbye ...
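To put numbers on that: the kernel samples the count of runnable plus uninterruptible tasks about every 5 seconds and folds it into an exponentially damped average. A rough floating-point sketch of the 1-minute figure follows. The kernel itself uses fixed-point arithmetic, so this is an approximation; the 8 stuck tasks and the two-minute duration are hypothetical values picked to resemble the symptoms in this thread, and simulate_load is an illustrative name.

```python
import math

SAMPLE_INTERVAL = 5.0  # seconds between load samples


def simulate_load(active_counts, window=60.0, start=0.0):
    """Exponentially damped average of per-sample task counts.

    active_counts: number of tasks in state R or D at each sample.
    window: averaging window in seconds (60 for the 1-minute figure).
    """
    decay = math.exp(-SAMPLE_INTERVAL / window)
    load = start
    history = []
    for n in active_counts:
        load = load * decay + n * (1.0 - decay)
        history.append(load)
    return history


# Hypothetical scenario: 8 tasks stuck in state D for two minutes
# (24 samples) on a machine that was otherwise idle.
stalled = simulate_load([8] * 24)
print(f"1-minute load after 2 minutes: {stalled[-1]:.2f}")
# prints: 1-minute load after 2 minutes: 6.92
```

So a handful of tasks blocked on one unresponsive disk is enough to push the 1-minute figure into the 6-8 range without a single extra process being spawned or any CPU being used.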

mk27 · 06-14-2011, 09:22 AM

Quote:

Originally Posted by syg00

It has nothing to do with the run queue - that's Unix thinking, not Linux.
Linux loadavg comprises runnable tasks plus those in uninterruptible sleep.

Not to get too picky, lol, but "runnable tasks" are the run queue (that's what it's called in the Linux kernel scheduler), so it has everything (as opposed to "nothing") to do with it.

It's a little hard to believe that sleeping processes contribute anything to the load average; you'll have to give me a source for that, because it is oxymoronic (sleeping processes do not use the CPU). Sleeping processes do have a load weight akin to the "nice" value, which determines their priority if they re-enter the run queue, but load average is about actual (not potential) activity. Load average will affect load weight, but not vice versa.

Maybe you should read:

http://www.linuxjournal.com/article/9001
http://luv.asn.au/overheads/NJG_LUV_2002/luvSlides.html
et al.

syg00 · 06-14-2011, 06:47 PM

Quote:

Originally Posted by mk27

It's a little hard to believe that sleeping processes contribute anything to the load average; you'll have to give me a source for that, because it is oxymoronic (sleeping processes do not use the CPU).

"man proc" and look for loadavg.
The first of your links is reasonably good - I have referred people to it myself. You'll note it also refers to uninterruptible, but only obliquely - and not strictly correctly.
The second link is for Unix, not Linux.

mk27 · 06-15-2011, 07:28 AM

Quote:

Originally Posted by syg00

"man proc" and look for loadavg.

By coincidence, I'm working on a process logger, so I've been looking at that page quite a bit. Here's the part you reference, but don't cite...

Quote:

Originally Posted by man proc (for kernel 2.6+)

/proc/loadavg
The first three fields in this file are load average figures giving the number of jobs in the run queue (state R) or waiting for disk I/O (state D) averaged over 1, 5, and 15 minutes. They are the same as the load average numbers given by uptime(1) and other programs. The fourth field consists of two numbers separated by a slash (/). The first of these is the number of currently executing kernel scheduling entities (processes, threads); this will be less than or equal to the number of CPUs. The value after the slash is the number of kernel scheduling entities that currently exist on the system. The fifth field is the PID of the process that was most recently created on the system.

Completely unequivocal and unambiguous: the load average is "the number of jobs in the run queue (state R) or waiting for disk I/O (state D) averaged over 1, 5, and 15 minutes".

I don't see anything about sleeping processes (state S) here, because, once again, that would make no sense.

[edit: state D is uninterruptible sleep; keep reading if you care]
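Since the fields quoted above are plain whitespace-separated text, a process logger can split them directly. A minimal sketch, with LoadAvg, parse_loadavg and read_loadavg as made-up names; it assumes the five-field layout described in the man page excerpt.

```python
from typing import NamedTuple


class LoadAvg(NamedTuple):
    one: float      # 1-minute average of tasks in state R or D
    five: float     # 5-minute average
    fifteen: float  # 15-minute average
    running: int    # scheduling entities currently executing
    total: int      # scheduling entities that currently exist
    last_pid: int   # PID of the most recently created process


def parse_loadavg(text):
    """Split the contents of /proc/loadavg into typed fields."""
    one, five, fifteen, entities, last_pid = text.split()
    running, total = entities.split("/")
    return LoadAvg(float(one), float(five), float(fifteen),
                   int(running), int(total), int(last_pid))


def read_loadavg(path="/proc/loadavg"):
    """Read and parse the live file (Linux only)."""
    with open(path) as f:
        return parse_loadavg(f.read())
```

Comparing the first field against the fourth during an event is one way to see the point under dispute: a high average alongside a small "running" count suggests the load is coming from tasks in state D rather than from tasks using the CPU.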

Quote:

The first of your links is reasonably good - I have referred people to it myself. You'll note it also refers to uninterruptible, but only obliquely - and not strictly correctly.

No, it does not do so even obliquely. This has nothing to do with sleeping processes. Honestly.

Quote:

The second link is for Unix, not Linux.

It says, quite clearly in the title: Linux Load Average. If you actually read it, you might notice this is derived from the author's input into the development of the CFS scheduler in the Linux kernel, which is what manages the run queue.

sundialsvcs · 06-15-2011, 08:29 AM

It sounds like some kind of mutex or application-deadlock problem. If commands can be typed at the keyboard, etc, then the basic operating system is working just fine. It could be an XWindows interface issue. In other words, "the system is not hung ... the workloads are waiting on something, and probably timing out." The clues are probably being logged, all right, just not in the logs that you're looking at.

What happens, for example, if you do the ol' Ctrl+Alt+F4 thing to move to an actual terminal-window, bypassing the windowed interface completely?

mk27 · 06-15-2011, 08:49 AM

Quote:

Originally Posted by sundialsvcs

It sounds like some kind of mutex or application-deadlock problem. If commands can be typed at the keyboard, etc, then the basic operating system is working just fine. It could be an XWindows interface issue. In other words, "the system is not hung ... the workloads are waiting on something, and probably timing out." The clues are probably being logged, all right, just not in the logs that you're looking at.

What happens, for example, if you do the ol' Ctrl+Alt+F4 thing to move to an actual terminal-window, bypassing the windowed interface completely?

Oh I did find it in the logs, qv. post #6.

I'm sure now it is because of the disk error, tho not sure why that has to be the case, or how it resolves itself after a few minutes. I'm also still hoping it is some bad blocks so I don't have to replace the HD (I'm going to run a scan today).

[later: e2fsck -c did fix the problem]

I suppose the original post is mostly solved, but -- for posterity, because everyone including me consults stuff like this via Google -- I did not want to leave syg00's authoritative-seeming but erroneous info unchallenged. Stuff like that can follow a telephone-game-like pattern, whereby a year from now I see it mutate into "the load average is the number of sleeping processes divided by the number of uninterruptible sleeping processes", or something.

syg00 · 06-15-2011, 09:18 AM

Your arrogance approaches your ignorance. State "D" is uninterruptible sleep. Period.

See the source for sched.c to educate yourself.