Linux - Hardware This forum is for Hardware issues.
Having trouble installing a piece of hardware? Want to know if that peripheral is compatible with Linux? |
05-21-2006, 04:28 PM
|
#1
|
LQ Newbie
Registered: May 2006
Posts: 2
Rep:
|
What piece of hardware is crashing my computer?
I have an old computer that was given away because Windows wasn't working on it. I thought it might just be a software problem, so I installed Mandrake and it works fine most of the time, but sometimes it crashes for no apparent reason. I have decided that it probably is a hardware problem.
- I have found that it works better if it stays cooler.
- It sometimes works fine for a week or two before crashing.
- The likelihood of it crashing increases when I do something like scp a very large file to it.
- Immediately restarting doesn't work. It needs to sit off for a while before being turned back on.
I know it's old and should just be replaced, but if I could find out which component was failing I might be able to replace it.
So my question is: how do I know which piece of hardware is causing the crashes? What should I be looking for?
Now that I think about it, I suppose I could run some memory or hard drive diagnostics. Can someone recommend something?
Thanks a lot.
--
Tim
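
One way to check the observation that crashes follow heavy activity (such as a large scp) is to log the load average at intervals and read the last entries after the next crash. Below is a minimal Python sketch, not something posted in this thread. The helper names (format_sample, append_sample, record_forever) and any log path you pass in are hypothetical, and it assumes a Unix system where os.getloadavg() is available.

```python
import os
import time


def format_sample(timestamp, load1):
    """Format one log line: local time plus the 1-minute load average."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(timestamp))
    return f"{stamp} load1={load1:.2f}\n"


def append_sample(path, line):
    """Append a line, then ask the OS to flush it to disk.

    Without the fsync, the last samples before a hard crash may still be
    sitting in a cache and never reach the file.
    """
    with open(path, "a") as log:
        log.write(line)
        log.flush()
        os.fsync(log.fileno())


def record_forever(path, interval_seconds=60):
    """Sample the load average until interrupted (hypothetical helper)."""
    while True:
        load1, _, _ = os.getloadavg()
        append_sample(path, format_sample(time.time(), load1))
        time.sleep(interval_seconds)
```

Calling record_forever with a file on the local disk and leaving it running in the background should leave a trail showing how busy the machine was shortly before it went down.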
|
|
|
05-21-2006, 04:36 PM
|
#2
|
Moderator
Registered: Nov 2002
Location: Kent, England
Distribution: Debian Testing
Posts: 19,192
|
You could use memtest86 to see if it's the memory. You could also check your logs to see if the failing hardware throws out a message before dying.
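
As a rough illustration of the log check, here is a small Python sketch that scans log lines for a few phrases that often accompany hardware trouble. The keyword list is an assumption chosen for illustration, not an exhaustive or authoritative set, and find_hardware_hints is a hypothetical helper name.

```python
import re

# Illustrative patterns only (an assumption); extend them to suit your logs.
SUSPICIOUS = re.compile(
    r"machine check|\bmce\b|i/o error|thermal|overheat|\boops\b",
    re.IGNORECASE,
)


def find_hardware_hints(lines):
    """Return (line_number, text) pairs for lines that match SUSPICIOUS."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if SUSPICIOUS.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


# Small demonstration on made-up sample lines.
sample = [
    "May 21 16:01:02 box sshd: session opened for user tim\n",
    "May 21 16:05:40 box kernel: end_request: I/O error, dev hda, sector 12345\n",
]
for number, text in find_hardware_hints(sample):
    print(f"{number}: {text}")
```

To use it on a real log, open the file and pass the file object to find_hardware_hints. The location of the system log varies by distribution, so the path is left to the reader.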
|
|
|
05-21-2006, 04:37 PM
|
#3
|
LQ Veteran
Registered: Mar 2003
Location: Boise, ID
Distribution: Mint
Posts: 6,642
Rep:
|
Based on your comments that it seems to run better when cooler, and that it needs to wait to cool down for a while before being turned back on, my money would be on the CPU fan being the culprit.
Open up the box, blow out all accumulated dust, and verify that all the system fans are spinning freely and at a fast speed. Additionally, make sure that the fans are installed properly (front fans should pull air into the cabinet, rear fans should blow air out of the cabinet) and that there aren't obstructions in front of the fans. Lastly, you can use a monitoring utility like gkrellm to check your system temps.
Good luck with it and Welcome to LQ!
|
|
|
05-21-2006, 04:39 PM
|
#4
|
Member
Registered: Dec 2004
Location: Malta
Distribution: Debian Sid
Posts: 863
Rep:
|
As you imply, I'd suspect temperature.
The first thing I'd look at is the CPU temperature. Depending on the age of the computer it may have a hardware monitor in CMOS setup or you may be able to get lm_sensors to work if it has sensors.
The easy way, though, is to check that the fan is rotating fast and silently, remove the heat sink and dust it out, and ensure that there is a thermal pad or thermal grease between the heat sink and the CPU.
For memory testing: memtest86 or docmemory, both downloadable. I've come to like memtest86 and it is on the Knoppix CD too.
Is the power supply fan working OK?
Are the ribbon cables obstructing airflow over the memory?
Is the graphics card overheating (use your finger to find out)?
Are the unused slots at the back closed off? If not you're messing up the airflow.
Are the air inlets at the front clogged with dust? How about the air inlets to the power supply?
Hope this helps
|
|
|
05-21-2006, 05:08 PM
|
#5
|
Registered User
Registered: Apr 2004
Posts: 560
Rep:
|
There is a little-known feature of PSUs that causes them to cut out if they overheat, often refusing to restart for an hour or so until they cool.
De-dust it and stick a few big fans in to keep it all moving nicely.
|
|
|
05-21-2006, 08:41 PM
|
#6
|
LQ Guru
Registered: Jan 2002
Posts: 6,042
Rep:
|
Start by replacing the power supply. As power supplies age, their ability to deliver their "advertised" power rating gets worse. I suggest a power supply that costs 100 US dollars or more. Do not get a 50 US dollar power supply rated at 500 watts, because it will not be a quality unit; a 300 watt power supply costing 100 US dollars is more likely to be. As some previous posters said, use compressed air to clean the whole computer. If problems still come up, check the temperature of the system. If the temperatures are within spec, you could have a problem with the processor, BIOS, or memory. The BIOS can be flashed and the memory can be tested using memtest86. I suggest running memtest86 on a computer that works, which means taking out the old memory and installing it in another system. If everything checks out, there is a possibility that the processor is going.
|
|
|
05-22-2006, 12:22 AM
|
#7
|
Member
Registered: Dec 2004
Location: Malta
Distribution: Debian Sid
Posts: 863
Rep:
|
Since the crashes seem to be linked with activity, I wouldn't replace the PSU until the other simple things have been done.
I was waiting for the results before suggesting a look at voltages or simply replacing the PSU. It's not a thermal shutdown but could be voltage drop.
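
If it does come to looking at voltages, the readings from the BIOS hardware monitor or a multimeter can be compared against the nominal rail values. The Python sketch below is only an illustration. The +/-5% figure is the commonly cited ATX tolerance for the positive rails and is used here as an assumption, so check the documentation for your own supply; RAILS and out_of_tolerance are hypothetical names.

```python
# Nominal rail voltage and allowed deviation as a fraction of nominal.
# The 5% tolerance is an assumption based on the commonly cited ATX figure.
RAILS = {
    "+3.3V": (3.3, 0.05),
    "+5V": (5.0, 0.05),
    "+12V": (12.0, 0.05),
}


def out_of_tolerance(readings):
    """Return the sorted rail names whose reading deviates more than allowed."""
    bad = []
    for name, measured in readings.items():
        nominal, tolerance = RAILS[name]
        if abs(measured - nominal) > nominal * tolerance:
            bad.append(name)
    return sorted(bad)


# Made-up example readings: the +5V rail has sagged to 4.70 V.
print(out_of_tolerance({"+3.3V": 3.31, "+5V": 4.70, "+12V": 11.9}))
```

A rail that only sags while the machine is busy would fit the pattern of crashes during large transfers, so readings taken under load are more telling than readings taken at idle.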
|
|
|
05-22-2006, 03:17 PM
|
#8
|
LQ Newbie
Registered: May 2006
Posts: 2
Original Poster
Rep:
|
Wow. I didn't think I would get such a response!
It sounds like I should consider overheating first. I have already taken the cover off and put a little fan blowing into it (but only a very little one), and the closet door is open. These things have helped, but it still crashes.
Since I don't have an X-server installed, gkrellm won't help me much.
I found a lm_sensors file sitting in my /etc/rc.d/init.d directory, but when I run "service lm_sensors status", it says "No sensors found!".
When I run sensors-detect, this is some of the output it produces (I took the default choices):
We can start with probing for (PCI) I2C or SMBus adapters.
You do not need any special privileges for this.
Do you want to probe now? (YES/no):
Probing for PCI bus adapters...
Use driver `i2c-sis5595' for device 00:01.0: Silicon Integrated Systems SIS5595
Use driver `to-be-written' for device 00:00.0: Silicon Integrated Systems SIS5581/5582/5597/5598 (To be written - Do not use 5595 drivers)
Probe succesfully concluded.
We will now try to load each adapter module in turn.
Module `i2c-sis5595' already loaded.
Load `to-be-written' (say NO if built into your kernel)? (YES/no):
FATAL: Module to_be_written not found.
Loading failed... skipping.
If you have undetectable or unsupported adapters, you can have them
scanned by manually loading the modules before running this script.
...
Probing for `Silicon Integrated Systems SIS5595'
Trying general detect... Success!
(confidence 9, driver `sis5595')
...
Now follows a summary of the probes I have just done.
Just press ENTER to continue:
Driver `sis5595' (should be inserted):
Detects correctly:
* ISA bus, undetermined address (Busdriver `i2c-isa')
Hint: Try forcing the chip address. Consult the documentation
of particular chip for details and address value.
Chip `Silicon Integrated Systems SIS5595' (confidence: 9)
I'm not exactly sure what this means. I think I have a SIS5595 chip but no driver for it. I'm sorry I don't know that much about hardware or kernel stuff or modules or drivers.
I just peeked into the box. It was a little dark, but I found a fan that was spinning nicely, and a chip that says "SIS 5598" on it.
So what do I do now?
|
|
|
05-23-2006, 01:01 AM
|
#9
|
Member
Registered: Dec 2004
Location: Malta
Distribution: Debian Sid
Posts: 863
Rep:
|
lm_sensors can be difficult to set up because different mobo makers may use different ways of reading the same chip (look at /etc/sensors.conf).
Do you have a hardware monitor section in BIOS setup? That will show temperatures and allow you to calibrate lm_sensors.
However, there is more to overheating than just the CPU, so I strongly recommend you go through the general checklist in my earlier post.
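
For a machine without an X server, temperatures can also be read from the console once a sensor driver is working. The Python sketch below assumes a kernel that exposes the hwmon sysfs interface with temp*_input files directly under /sys/class/hwmon/hwmon*/, which may not match older kernels or this particular board. read_temperatures is a hypothetical helper, and it will simply return nothing if no driver has bound to the chip.

```python
import glob
import os


def read_temperatures(root="/sys/class/hwmon"):
    """Return {sensor_path: degrees_celsius} for each temp*_input file found."""
    temps = {}
    pattern = os.path.join(root, "hwmon*", "temp*_input")
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as sensor:
                # The hwmon interface reports millidegrees Celsius.
                temps[path] = int(sensor.read().strip()) / 1000.0
        except (OSError, ValueError):
            # Skip sensor files that cannot be read or parsed.
            continue
    return temps


# Prints nothing if no sensors are exposed on this system.
for path, celsius in read_temperatures().items():
    print(f"{path}: {celsius:.1f} C")
```

The raw values are uncalibrated, so comparing them with what the BIOS hardware monitor shows is still worthwhile.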
|
|
|
All times are GMT -5. The time now is 11:30 AM.
|