LinuxQuestions.org

System freezes often, not sure why (https://www.linuxquestions.org/questions/linux-general-1/system-freezes-often-not-sure-why-575163/)

Alotau 08-06-2007 02:42 PM

System freezes often, not sure why
 
Hello,

I am running RHEL4, Linux version 2.6.9-5.EL. Sometimes my entire system freezes. No mouse, no keyboard, just a static image on the screen. Oh, the caps lock LED flashes slowly when this happens and I find that odd. I found one reference to that:

http://www.linuxquestions.org/questi...d.php?t=464015

My freezes typically happen while I am compiling. Currently it happens while building a large piece of software, and sometimes while running Eclipse. I can't nail down any other specifics, and it may freeze at other times as well; I'll have to start keeping better track. This happens 1-10 times per week, I'd say, and I haven't found a completely reproducible case yet. Let's see, what other vagueness can I describe? I've tried some "Alt+SysRq" combinations to shut down safely when this happens, but that doesn't seem to work. I have to press and hold the power button and just hope my hard drive plays nice when I turn the system back on (it has thus far).
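A note on the Alt+SysRq attempts: the magic SysRq keys only respond if the kernel has them switched on, and as far as I recall Red Hat's stock /etc/sysctl.conf sets kernel.sysrq = 0, so that is worth ruling out first. The sketch below only reports the current setting; describe_sysrq is a name made up for illustration, and a hard panic may ignore SysRq no matter what the setting is.

```shell
#!/bin/sh
# describe_sysrq (hypothetical helper): turns the value stored in
# /proc/sys/kernel/sysrq into a readable status. 0 means disabled;
# any other value means at least some SysRq functions are enabled.
describe_sysrq() {
    case "$1" in
        0) echo "disabled" ;;
        *) echo "enabled" ;;
    esac
}

# Report the current setting, if the file is readable on this machine.
if [ -r /proc/sys/kernel/sysrq ]; then
    describe_sysrq "$(cat /proc/sys/kernel/sysrq)"
fi

# To enable until the next reboot (as root):
#   echo 1 > /proc/sys/kernel/sysrq
# To make it permanent, set "kernel.sysrq = 1" in /etc/sysctl.conf.
```

With SysRq enabled, the usual emergency sequence is Alt+SysRq+S (sync), then U (remount read-only), then B (reboot).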

Someone in my office suggested that I get some more memory. I did that, up to 2GB now, but it hasn't stopped the freezes (though the restarts are a little quicker!).

The problem doesn't seem to depend on the overall system load. I've had it freeze while compiling with Firefox, Eclipse and other stuff open along with streaming Internet radio. I've also had it freeze with just one terminal window open.

If any further info would be helpful, I'll gladly provide what I can. Would love to track this problem down.

Thanks in advance...

stress_junkie 08-06-2007 03:19 PM

Is it possible that the compiler is just stealing the CPU such that the system won't respond to interactive input? I have found that, IMO, the Linux scheduler stinks when you have one busy job.

I have found that a CPU-bound process can coexist with an interactive session if the CPU-bound process runs at a nice value of 19, which is the lowest priority. Try running your compiler that way (nice -19 in the old option syntax, i.e. nice -n 19) and see what happens.
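For reference, a sketch of the nice suggestion. Niceness runs from -20 (most favored) to 19 (least favored), so a nice value of 19 gives the build the lowest CPU priority. The wrapper name run_lowprio is invented here, and the make invocation in the comment is only an assumed example of a build command.

```shell
#!/bin/sh
# run_lowprio (hypothetical helper): runs any command at niceness 19,
# the lowest CPU scheduling priority.
run_lowprio() {
    nice -n 19 "$@"
}

# Harmless demonstration that the wrapper passes the command through.
run_lowprio echo "running at low priority"

# Real use would look like:
#   run_lowprio make all
# A process that is already running can be lowered with:
#   renice 19 -p <pid>      # <pid> is a placeholder for the process ID
```

If the freeze is a kernel panic rather than CPU starvation, this will not help, but it is a cheap thing to rule out.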

raskin 08-06-2007 03:24 PM

Well, not if Scroll Lock blinks in sync with Caps Lock. Scroll Lock and Caps Lock blinking slowly in sync is how the Linux kernel signals a kernel panic (possibly one severe enough that writing any message to the log would be dangerous). You could have bad RAM, for example...

Edit: kernel panic will be caused by data corruption, but it can be caused, in turn, by physical problems with RAM.

Alotau 08-06-2007 03:25 PM

Thanks for the reply. It isn't just a slow system making it hard for input devices to work; everything is frozen. There is no hard drive activity, the caps lock light blinks, and it won't recover no matter how long I leave it that way. I hadn't thought of renicing things. Might try it if no one else has encountered this specific problem before.

Alotau 08-06-2007 03:27 PM

Quote:

Originally Posted by raskin
kernel panic will be caused by data corruption, but it can be caused, in turn, by physical problems with RAM.

Hmmmm, thanks for the reply.

Since I've upgraded my RAM, I have had the same problem. So either the old RAM and the new RAM had problems or it's something else. Is there a good RAM checking utility? I've never thought to look for one...

raskin 08-06-2007 03:29 PM

Did you replace the RAM or add to it? If you only added, a bad original stick would still cause problems. Memtest86+ is a CD-based RAM-checking utility.

Alotau 08-06-2007 03:33 PM

I have two slots. For my first upgrade, I added a stick to the second (empty) slot. For the second upgrade, I replaced the original (smaller) stick. In all combinations I've had the freezing problem.

raskin 08-06-2007 03:42 PM

Well, then the problem is probably not in physical memory... By the way, I hope your system doesn't use swap having 2GB RAM? And what about Scroll Lock LED - does it blink in sync?

Alotau 08-06-2007 03:50 PM

Quote:

Originally Posted by raskin
what about Scroll Lock LED - does it blink in sync?

The caps lock is the only blinking LED. Honestly, I haven't checked EVERY time that it freezes, but the 50% of the time that I have noticed, it was only the caps lock.


Quote:

Originally Posted by raskin
I hope your system doesn't use swap having 2GB RAM?

Well, I might be? The system was set up with 512MB originally. I just opened it up and put in the RAM. My /proc/meminfo looks like this right now:

MemTotal: 2075648 kB
MemFree: 1181176 kB
Buffers: 44748 kB
Cached: 628948 kB
SwapCached: 0 kB
Active: 334420 kB
Inactive: 489124 kB
HighTotal: 1179072 kB
HighFree: 394816 kB
LowTotal: 896576 kB
LowFree: 786360 kB
SwapTotal: 1044216 kB
SwapFree: 1044216 kB
Dirty: 124 kB
Writeback: 0 kB
Mapped: 232436 kB
Slab: 58632 kB
Committed_AS: 537532 kB
PageTables: 2736 kB
VmallocTotal: 106488 kB
VmallocUsed: 2748 kB
VmallocChunk: 103540 kB
HugePages_Total: 0
HugePages_Free: 0
Hugepagesize: 4096 kB


So there is 1GB swap space set up? But isn't being used right now? Am I reading that correctly? More to the point, what is the concern with using swap with 2GB of RAM? Should I disable it? If so, how? Thanks for the help...
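On the "am I reading that correctly" part: swap in use is SwapTotal minus SwapFree, and with the numbers posted above that is 1044216 - 1044216 = 0 kB, so swap is configured but idle. A small sketch that does the same subtraction; swap_used_kb is a name invented for this example.

```shell
#!/bin/sh
# swap_used_kb (hypothetical helper): reads meminfo-formatted text on
# stdin and prints SwapTotal - SwapFree in kB.
swap_used_kb() {
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { free  = $2 }
         END           { print total - free }'
}

# The two relevant lines from the post above:
printf 'SwapTotal: 1044216 kB\nSwapFree: 1044216 kB\n' | swap_used_kb

# On a live system:
#   swap_used_kb < /proc/meminfo
```

Swap can be turned off until the next reboot with swapoff -a (as root), though nothing in these numbers suggests that is necessary.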

raskin 08-06-2007 03:58 PM

Hm, so it is even different from the kernel panics I have seen. About swap: no, it is not bad if your OS decides to use some swap under heavy load. My concern was swap corruption, which would lead to memory corruption even after you replace all your memory. If you can reproduce the panic from console mode, you could get some error output. Also maybe the problem is in some driver; people say that power control (ACPI) is relatively frequent reason, so try to boot with adding noacpi option to kernel parameters.
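For anyone trying the ACPI suggestion: the spelling I am sure the kernel documents is acpi=off (there is also pci=noacpi, which only affects IRQ routing and PCI scanning); I can't confirm that a bare noacpi is honored. On RHEL4 the option would normally go at the end of the kernel line in /boot/grub/grub.conf, or be typed at the GRUB prompt for a one-off test. After rebooting, a sketch like the following checks whether the option actually reached the kernel; has_boot_param is an illustrative name, not an existing tool.

```shell
#!/bin/sh
# has_boot_param (hypothetical helper): succeeds if the given parameter
# appears as a whole word in a kernel command line string.
#   $1 = parameter to look for, $2 = command line text
has_boot_param() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Check the command line the running kernel was booted with.
if [ -r /proc/cmdline ]; then
    if has_boot_param "acpi=off" "$(cat /proc/cmdline)"; then
        echo "ACPI was disabled at boot"
    else
        echo "acpi=off is not on the kernel command line"
    fi
fi
```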

Alotau 08-06-2007 04:11 PM

Quote:

Originally Posted by raskin
maybe the problem is in some driver; people say that power control (ACPI) is relatively frequent reason, so try to boot with adding noacpi option to kernel parameters.

So if I try this and just don't have a freeze for a week or two, we could say that was probably the problem? Will I lose any functionality from trying this? Of course, I'd love to lose the "random freeze" functionality, so it's probably worth it.

raskin 08-06-2007 05:39 PM

On a desktop you won't lose much. You'll lose, for example, the 'init 0 by power button' functionality that you can configure with ACPI. Powering the computer off after init 0 will have to be done through APM, which may work better or worse.

Alotau 08-06-2007 05:56 PM

Thanks Raskin,

I'll post again if the proposed fix doesn't work.

mjones490 08-07-2007 01:26 PM

Have you tried booting with another kernel image? It's possible that your current kernel image has been corrupted somehow.

Alotau 08-08-2007 02:57 PM

Update:

I haven't tried disabling drivers or upgrading my kernel, but I have found a way to force the system to freeze. Based on the information below, it looks like it is not a memory issue, and probably not a driver issue?

I have a makefile. When I use this makefile, my system freezes. When I manually (on the command line) perform the commands in the makefile, there is no freezing. Weird. It is a very simple makefile, nothing fancy (I didn't write this, it is just one part of a larger project):

Code:

SHELL = /bin/sh

all :
        javac -Xlint:unchecked -d classes batch_wizard/*.java

clean :
        @echo "removing .class files..."
        -@$(RM) classes/batch_gui/*.class

clobber : clean

I added the -Xlint:unchecked flag while tracking this down... it freezes with or without it.

Now on the command line while in the proper directory I can just use:

Code:

javac -Xlint:unchecked -d classes batch_wizard/*.java

And everything works fine. With the makefile, I can watch it go through compiling to the last file (I tried it with the -verbose flag) and right when you'd expect your command line prompt to come back, that is when the system freezes. So is there something weird with GNU Make? I am using version 3.80 if that matters.

I tried commenting out the first line (SHELL = /bin/sh) and that had no visible effect.

Any advice on where to go from here? I can provide more details if helpful...
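One possible next step, now that the freeze is reproducible: run the build from a text console (Ctrl+Alt+F1), where a panic message has a chance of showing up, and keep a log that is flushed to disk so it survives the hard power-off. The sketch below is only a suggestion; run_logged and build.log are names invented here, and if the kernel dies before sync runs, the last lines may still be lost.

```shell
#!/bin/sh
# run_logged (hypothetical helper): runs a command, appends its output
# and exit status to a log file, then flushes disk buffers.
#   $1 = log file, remaining arguments = command to run
run_logged() {
    log=$1; shift
    echo "=== $* ===" >> "$log"
    "$@" >> "$log" 2>&1
    status=$?
    echo "=== exit status $status ===" >> "$log"
    sync
    return $status
}

# Intended use, from the project directory:
#   run_logged build.log make -n all    # only print what make would run
#   run_logged build.log make all       # the run that freezes
```

Comparing the make -n output with the command typed by hand shows whether make is really running the same thing, and whether the exit status line made it into the log tells you if make returned before the freeze.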

Thanks again.


All times are GMT -5.