LinuxQuestions.org

hosler 01-18-2010 01:54 PM

How do I diagnose a computer freeze?
 
My old custom PC is starting to die on me. Half the time it won't POST, and when it does beep and start to load, the system freezes at random. It will freeze in the BIOS settings, while loading the OS, or just whenever it wants to. Sooner or later I actually get the computer running, and it freezes less and less frequently until it can stay up for days, as long as I don't reboot it. Rebooting the machine usually starts the whole freezing process all over again.
Anyway, I want to learn how to figure out what is not working on my computer. I have been swapping out RAM, PSUs, and video cards and changing BIOS settings to see if I could fix it, but nope. So how do I go about diagnosing this problem? I'm pretty much assuming it's a hardware one.

AleLinuxBSD 01-18-2010 02:24 PM

Maybe the BIOS (CMOS) battery is discharged and the BIOS has lost its correct settings.

devnull10 01-18-2010 03:20 PM

Have you tried resetting the BIOS using the onboard jumper?

hosler 01-18-2010 06:40 PM

The BIOS is fine. It's been reset a few times, and I just replaced the CMOS battery.

H_TeXMeX_H 01-19-2010 05:26 AM

Personally, I would run all the software tests I can, such as memtest86 (RAM) and smartctl (disk health). If you want, you can use the Ultimate Boot CD, which bundles diagnostic tools:
http://ubcd.sourceforge.net/

It could also be the PSU. You said you switched it out, but was the replacement a newer one that supplies enough power to run this system?

Also, when did this start happening? Are there any context clues, like something you did right before it started?

If the PC crashes while running Linux, check /var/log/messages and /var/log/syslog for clues.
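
To make the log check less tedious, here is a minimal Python sketch that pulls out lines that often accompany hardware trouble. The keyword list and the function names (suspicious_lines, scan) are my own assumptions, not anything standard, so adjust them to what your kernel actually logs. Note that a hard freeze often leaves nothing in the logs at all, so an empty result proves little.

```python
import re
from pathlib import Path

# Keyword list is an assumption: phrases that often show up near kernel or
# hardware trouble. Extend or trim it for your own system.
PATTERN = re.compile(
    r"machine check|\bmce\b|hardware error|i/o error|\boops\b|panic|segfault|thermal",
    re.IGNORECASE,
)


def suspicious_lines(lines):
    """Return the lines (without trailing newline) that match a keyword."""
    return [line.rstrip("\n") for line in lines if PATTERN.search(line)]


def scan(paths=("/var/log/messages", "/var/log/syslog")):
    """Print matching lines from each log file that exists and is readable."""
    for name in paths:
        path = Path(name)
        if not path.is_file():
            continue  # not every distribution has both files
        try:
            with path.open(errors="replace") as handle:
                for line in suspicious_lines(handle):
                    print(f"{name}: {line}")
        except OSError as exc:
            # Typically a permissions problem; reading logs may need root.
            print(f"{name}: could not read ({exc})")
```

Call scan() from a root shell after the next crash and reboot, and look at the last few matches before the time of the freeze.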

onebuck 01-19-2010 07:14 AM

Hi,

In addition to what 'H' has said, I would first look at the BIOS health (hardware monitor) screen to check conditions. Swapping out the PSU without solving the problem doesn't rule out a power-related issue. You could be having a load issue with the peripherals, the motherboard, or something else.

Break the system down to the minimum hardware and test it. If the system boots and stays up, add each component back one at a time.

BTW, check the CPU HSF (heatsink and fan); a general system cleaning wouldn't hurt either. You may need to clean the heat sink and reapply thermal compound.

:hattip:

cgtueno 01-20-2010 01:17 AM

Hi hosler

onebuck is absolutely correct in his approach.

1) Thoroughly clean the system.
Remove all dust and debris. Pay special attention to the CPU fan.
If possible, remove the CPU fan and clean both it and the underlying heatsink.
2) Ensure all leads, cables, and expansion cards are correctly connected.
3) Ensure that the CPU is seated properly, and that the CPU and the heatsink are mounted flush together. If the CPU and the heatsink have become separated, clean the surfaces (very carefully), apply a suitable heatsink compound, re-assemble, and reinstall.
4) Disconnect all but the essential devices from the motherboard and re-test.
5) Inspect the PSU. Is the PSU fan working? Is the unit dust free? (Do NOT open the PSU under any circumstances.)
6) Check the BIOS health state for the PC if it successfully passes POST.
7) Run memtest86 (downloadable from the web) to check the RAM integrity.
8) If you have doubts about the PSU, replace it with a known working spare PSU.
The PSU can be tested by disconnecting it from the PC, connecting a dummy load, and using a multimeter to check the output voltage levels. Alternatively, you can use a plug-in PSU tester to check the PSU (available from retail shops and web vendors, but beware that these have limitations).
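
For the multimeter check in step 8, a minimal Python sketch of how readings could be compared against the nominal rail voltages. The tolerances below (5% on the positive rails, 10% on -12V) are what I understand the ATX specification to allow; treat them, and the names RAILS, check_rail, and report, as assumptions of this sketch and confirm against your PSU's documentation.

```python
# Nominal ATX rail voltage and allowed deviation as a fraction of nominal.
# Assumed values: +/-5% on positive rails, +/-10% on the -12V rail.
RAILS = {
    "+3.3V": (3.3, 0.05),
    "+5V": (5.0, 0.05),
    "+12V": (12.0, 0.05),
    "-12V": (-12.0, 0.10),
    "+5VSB": (5.0, 0.05),
}


def check_rail(name, measured):
    """Return True if the measured voltage is within tolerance for the rail."""
    nominal, tolerance = RAILS[name]
    return abs(measured - nominal) <= abs(nominal) * tolerance


def report(readings):
    """Map each measured rail name to True (in range) or False (out of range)."""
    return {name: check_rail(name, volts) for name, volts in readings.items()}
```

For example, report({"+12V": 11.2, "+5V": 5.1}) flags the +12V rail as out of range and the +5V rail as fine. A reading taken with no load, or a marginal rail that only sags under load, can still look healthy, which is one of the limitations mentioned above.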

If it is not dust contamination, bad RAM, or a faulty PSU, then the motherboard and/or CPU may be defective.

In essence, what is required is a systematic clean, a minimum-load test, and a systematic reconnect.
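
The reconnect part of that approach can be written down as a small procedure. This is only a sketch in Python: is_stable is a hypothetical callback standing in for "boot the machine in this configuration and run it long enough to be convinced", which in practice a person does by hand, and first_failing_component is a name I made up for illustration.

```python
def first_failing_component(minimal, extras, is_stable):
    """Add extras one at a time and return the first one that breaks stability.

    Returns None if every configuration tested was stable. Raises ValueError
    if even the minimal configuration is unstable, since the fault then lies
    in the core parts (motherboard, CPU, RAM, PSU) rather than an add-on.
    """
    installed = list(minimal)
    if not is_stable(installed):
        raise ValueError("minimal configuration is already unstable")
    for part in extras:
        installed.append(part)
        if not is_stable(installed):
            return part
    return None
```

With an intermittent fault like this one, a single stable run proves little, so each step needs a long soak test (or several reboots) before the next part goes back in.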

Commercially, faulty motherboards are generally checked for defects using a POST adapter, a card that is plugged into the motherboard's back plane. The card monitors the bus traffic and indicates the stage that the machine has reached during POST. When the machine fails, the card displays information about the last state that the motherboard reached. Generally these cards are not available at retail level.

It is very hard (almost impossible) to determine whether the PC has indeed reached a "frozen" (no CPU processing activity) state without resorting to the use of a logic analyzer or at least an oscilloscope.

Hope that assists

Regards

Chris


All times are GMT -5.