LinuxQuestions.org
Old 05-21-2006, 04:28 PM   #1
tkoop
LQ Newbie
 
Registered: May 2006
Posts: 2

Rep: Reputation: 0
What piece of hardware is crashing my computer?


I have an old computer that was given away because Windows wasn't working on it. I thought it might just be a software problem, so I installed Mandrake and it works fine most of the time, but sometimes it crashes for no apparent reason. I have decided that it probably is a hardware problem.

- I have found that it works better if it stays cooler.
- It sometimes works fine for a week or two before crashing.
- The likelihood of it crashing increases when I do something like scp a very large file to it.
- Immediately restarting doesn't work. It needs to sit off for a while before being turned back on.

I know it's old and should just be replaced, but if I could find out which component was failing I might be able to replace it.

So my question is: how do I know which piece of hardware is causing the crashes? What should I be looking for?

Now that I think about it, I suppose I could run some memory or hard drive diagnostics. Can someone recommend something?

Thanks a lot.

--
Tim
 
Old 05-21-2006, 04:36 PM   #2
XavierP
Moderator
 
Registered: Nov 2002
Location: Kent, England
Distribution: Debian Testing
Posts: 19,192
Blog Entries: 4

Rep: Reputation: 475
You could use memtest86 to see if it's the memory. You could also check your logs to see if the failing hardware throws out a message before dying.
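For the log check, a minimal sketch of what that could look like. The keyword list and the `/var/log/messages` default are assumptions, not something taken from this machine, so adjust both to whatever your distribution and kernel actually write.

```python
import re

# Hypothetical keyword list: tune it to the messages your kernel really emits.
HARDWARE_HINTS = re.compile(
    r"machine check|\bmce\b|i/o error|dma_intr|lost interrupt|"
    r"thermal|temperature|\boops\b|segfault|parity error",
    re.IGNORECASE,
)


def find_hardware_hints(lines):
    """Return (line_number, text) pairs for lines that match HARDWARE_HINTS."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if HARDWARE_HINTS.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_file(path="/var/log/messages"):
    """Print suspicious lines from a log file (the default path is an assumption)."""
    with open(path, errors="replace") as log:
        for number, text in find_hardware_hints(log):
            print("%s:%d: %s" % (path, number, text))
```

Calling `scan_file()` after a crash and reading the last few hits is usually more useful than reading the whole log, since the interesting messages tend to sit just before the reboot.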
 
Old 05-21-2006, 04:37 PM   #3
J.W.
LQ Veteran
 
Registered: Mar 2003
Location: Boise, ID
Distribution: Mint
Posts: 6,642

Rep: Reputation: 87
Based on your comments that it seems to run better when cooler, and that it needs to wait to cool down for a while before being turned back on, my money would be on the CPU fan being the culprit.

Open up the box, blow out all accumulated dust, and verify that all the system fans are spinning freely and at a fast speed. Additionally, make sure that the fans are installed properly (front fans should pull air into the cabinet, rear fans should blow air out of the cabinet) and that there aren't obstructions in front of the fans. Lastly, you can use a monitoring utility like gkrellm to check your system temps.

Good luck with it and Welcome to LQ!
 
Old 05-21-2006, 04:39 PM   #4
davcefai
Member
 
Registered: Dec 2004
Location: Malta
Distribution: Debian Sid
Posts: 863

Rep: Reputation: 45
As you imply, I'd suspect temperature.

The first thing I'd look at is the CPU temperature. Depending on the age of the computer it may have a hardware monitor in CMOS setup or you may be able to get lm_sensors to work if it has sensors.

The easy way, though, is to check that the fan is rotating fast and silently, remove the heat sink and dust it out, and ensure that there is a thermal pad or thermal grease between the heatsink and the CPU.

For memory testing: memtest86 or docmemory, both downloadable. I've come to like memtest86 and it is on the Knoppix CD too.

Is the power supply fan working OK?

Are the ribbon cables obstructing airflow over the memory?

Is the graphics card overheating? (Use your finger to find out.)

Are the unused slots at the back closed off? If not you're messing up the airflow.

Are the air inlets at the front clogged with dust? How about the air inlets to the power supply?

Hope this helps
 
Old 05-21-2006, 05:08 PM   #5
otchie1
Registered User
 
Registered: Apr 2004
Posts: 560

Rep: Reputation: 30
There is a little-known feature of PSUs that causes them to cut out if they overheat, often refusing to restart for an hour or so until they cool.

De-dust it and stick a few big fans in to keep it all moving nicely.
 
Old 05-21-2006, 08:41 PM   #6
Electro
LQ Guru
 
Registered: Jan 2002
Posts: 6,042

Rep: Reputation: Disabled
Start by replacing the power supply. When power supplies age, their ability to provide their "advertised" power rating gets worse. I suggest power supplies that cost 100 US dollars or more. Do not get a 50 US dollar power supply that claims 500 watts, because it does not provide quality. A 300 watt power supply costing 100 US dollars suggests quality. As some previous posters said, use compressed air to clean the whole computer. If problems still come up, check the temperature of the system. If the temperatures are within spec, you could have a problem with the processor, BIOS, or memory. The BIOS can be flashed and the memory can be tested using memtest86. I suggest running memtest86 on a computer that works, which means taking out the old memory and connecting it to another system. If everything works, there is a possibility that the processor is going.
 
Old 05-22-2006, 12:22 AM   #7
davcefai
Member
 
Registered: Dec 2004
Location: Malta
Distribution: Debian Sid
Posts: 863

Rep: Reputation: 45
Since the crashes seem to be linked with activity, I wouldn't replace the PSU until the other simple things have been done.

I was waiting for the results before suggesting a look at voltages or simply replacing the PSU. It's not a thermal shutdown but could be voltage drop.
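One way to check whether the crashes really track activity is to leave a small heartbeat logger running and look at its last line after the next crash. This is only a sketch: the file name, the 30-second interval, and the choice of load average as the "activity" measure are all assumptions.

```python
import os
import time


def heartbeat_line(now=None, load=None):
    """Format one log line: local timestamp plus the 1/5/15-minute load averages."""
    now = time.time() if now is None else now
    load = os.getloadavg() if load is None else load
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    return "%s load=%.2f,%.2f,%.2f" % ((stamp,) + tuple(load))


def write_heartbeat(path, line):
    """Append a line and force it to disk so it survives a hard crash."""
    with open(path, "a") as log:
        log.write(line + "\n")
        log.flush()
        os.fsync(log.fileno())


def run(path="heartbeat.log", interval=30):
    """Loop until the machine dies; the last line approximates the crash time."""
    while True:
        write_heartbeat(path, heartbeat_line())
        time.sleep(interval)
```

If the last entries before each crash show a high load (for example during a large scp), that points toward heat or a sagging supply voltage under load; if the machine dies while idle, it points elsewhere.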
 
Old 05-22-2006, 03:17 PM   #8
tkoop
LQ Newbie
 
Registered: May 2006
Posts: 2

Original Poster
Rep: Reputation: 0
Wow. I didn't think I would get such a response!

It sounds like I should consider overheating first. I have already taken the cover off and put a little fan blowing into it (but only a very little one), and the closet door is open. These things have helped, but it still crashes.

Since I don't have an X-server installed, gkrellm won't help me much.
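For a box without X, temperatures can also be read from the console. The `sensors` command from lm_sensors prints to the terminal, and on kernels that expose the hwmon sysfs interface the raw values can be read directly. The sketch below assumes the `/sys/class/hwmon` layout and only returns anything once a working sensor driver is loaded.

```python
import glob
import os


def read_temperatures(hwmon_root="/sys/class/hwmon"):
    """Return {sensor_path: degrees_celsius} for every temp*_input file found."""
    readings = {}
    # Newer kernels put the files under hwmonN/, some older ones under hwmonN/device/.
    patterns = [
        os.path.join(hwmon_root, "hwmon*", "temp*_input"),
        os.path.join(hwmon_root, "hwmon*", "device", "temp*_input"),
    ]
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            try:
                with open(path) as handle:
                    raw = handle.read().strip()
                # hwmon sysfs reports temperatures in millidegrees Celsius.
                readings[path] = int(raw) / 1000.0
            except (OSError, ValueError):
                # Skip sensors that cannot be read or return garbage.
                continue
    return readings
```

An empty result means no driver is exposing temperature inputs, which is the same situation as "No sensors found!".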

I found a lm_sensors file sitting in my /etc/rc.d/init.d directory, but when I run "service lm_sensors status", it says "No sensors found!".

When I run sensors-detect, this is some of the output it produces (I took the default choices):

We can start with probing for (PCI) I2C or SMBus adapters.
You do not need any special privileges for this.
Do you want to probe now? (YES/no):
Probing for PCI bus adapters...
Use driver `i2c-sis5595' for device 00:01.0: Silicon Integrated Systems SIS5595
Use driver `to-be-written' for device 00:00.0: Silicon Integrated Systems SIS5581/5582/5597/5598 (To be written - Do not use 5595 drivers)
Probe succesfully concluded.

We will now try to load each adapter module in turn.
Module `i2c-sis5595' already loaded.
Load `to-be-written' (say NO if built into your kernel)? (YES/no):
FATAL: Module to_be_written not found.
Loading failed... skipping.
If you have undetectable or unsupported adapters, you can have them
scanned by manually loading the modules before running this script.

...

Probing for `Silicon Integrated Systems SIS5595'
Trying general detect... Success!
(confidence 9, driver `sis5595')

...

Now follows a summary of the probes I have just done.
Just press ENTER to continue:

Driver `sis5595' (should be inserted):
Detects correctly:
* ISA bus, undetermined address (Busdriver `i2c-isa')
Hint: Try forcing the chip address. Consult the documentation
of particular chip for details and address value.
Chip `Silicon Integrated Systems SIS5595' (confidence: 9)



I'm not exactly sure what this means. I think I have a SIS5595 chip but no driver for it. I'm sorry I don't know that much about hardware or kernel stuff or modules or drivers.

I just peeked into the box. It was a little dark, but I found a fan that was spinning nicely, and a chip that says "SIS 5598" on it.

So what do I do now?
 
Old 05-23-2006, 01:01 AM   #9
davcefai
Member
 
Registered: Dec 2004
Location: Malta
Distribution: Debian Sid
Posts: 863

Rep: Reputation: 45
lm_sensors can be difficult to set up because different mobo makers may use different ways of reading the same chip (look at /etc/sensors.conf)

Do you have a hardware monitor section in BIOS setup? That will show temperatures and allow you to calibrate lm_sensors.

However, there is more to overheating than just the CPU, so I strongly recommend you go through the general checklist in my earlier post.
 
  



All times are GMT -5. The time now is 11:30 AM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration