LinuxQuestions.org
Old 11-10-2004, 06:21 PM   #1
qwijibow
LQ Guru
 
Registered: Apr 2003
Location: nottingham england
Distribution: Gentoo
Posts: 2,672

Rep: Reputation: 47
Memtest86 says my PC's fine... but I know it isn't!?


I have a 1.3GHz processor and 266MHz DDR RAM.
The processor is an Athlon.

My motherboard uses firmware to set the CPU and memory clock rates. It has 2 settings:
1GHz CPU and 200MHz RAM
or
1.3GHz CPU and 266MHz RAM.

I've run it at its correct, faster speed for years, and it's been perfect.
A few days ago my machine crashed for the first time ever (while booted into Linux).

The crashes are now so bad that it doesn't last more than 1 minute before freezing.

I know it's not a Linux software problem because, at the same time, Windows XP also stopped booting, as did the Gentoo Live CD, Knoppix, and Damn Small Linux.

So this HAS to be hardware.

If I underclock the machine to the slower speed, Linux works fine, Windows is okay,
and it does not crash unless I put very high load on it by recompiling a kernel.

I have 2 x 256MB RAM chips... I've tried taking one out, running, and then putting it back in and trying the other RAM chip... it still crashes. So either both RAM chips broke at the same time, or the fault is on the motherboard.

So I put in my Gentoo Live CD and booted into Memtest86 (with the machine set to its correct speed, NOT underclocked).

And what confuses me is that the machine works fine... I left it running for an hour. It completed 2 complete passes of the memtest cycle. Not a single error!!!

Why is it that when I try to boot Windows / Linux / Knoppix 3.4 / Knoppix 2.6 / DSL 0.4 / DSL 0.7 it crashes in seconds at 1.3GHz...

but will run for hours on Memtest86?

What's wrong?
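
For readers wondering what a "pass" actually checks: the basic idea behind a memory tester is to write known bit patterns and read them back. The following is only a minimal Python sketch of that idea, assuming an ordinary in-process buffer. It is not Memtest86's actual algorithm, and a user-space script cannot reach physical RAM the way a bootable tester does. The name `pattern_test` and the four patterns are illustrative choices.

```python
def pattern_test(size=1 << 16, patterns=(0x00, 0xFF, 0x55, 0xAA)):
    """Write each byte pattern across a buffer, read it back, and
    return (offset, expected, actual) for every mismatch found."""
    buf = bytearray(size)
    errors = []
    for p in patterns:
        # Fill the whole buffer with the current pattern...
        for i in range(size):
            buf[i] = p
        # ...then verify that every byte still holds it.
        for i in range(size):
            if buf[i] != p:
                errors.append((i, p, buf[i]))
    return errors


if __name__ == "__main__":
    print(len(pattern_test()), "mismatches")
```

A clean result like this only says the tested cells held those patterns at that moment. It exercises very little else in the machine, which may be part of why a tester can pass while a full OS boot fails.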
 
Old 11-10-2004, 06:26 PM   #2
kevinatkins
Member
 
Registered: Jan 2004
Location: cheshire, uk
Distribution: Ubuntu Hoary
Posts: 605

Rep: Reputation: 33
Hi,

It's possibly a result of the processor overheating, particularly since you suggest that the problem is ameliorated to some extent by dropping the clock speed.

Are all processor / case fans running OK? Vents unobstructed? On my machine, a duct runs from a case fan over to the processor, so at first sight it's not possible to see the processor fan running.
 
Old 11-11-2004, 01:23 AM   #3
J.W.
LQ Veteran
 
Registered: Mar 2003
Location: Boise, ID
Distribution: Mint
Posts: 6,642

Rep: Reputation: 87
Other questions to ask - if it's been running fine for the past couple of years, what has changed recently? Have you done any upgrades, kernel recompiles, added new hardware, etc, etc? If there have been no changes, then I think kevinatkins' theory should be investigated. It may be that your CPU fan has either failed, or just isn't spinning fast enough anymore to keep the chip cool. Definitely open up your case to confirm that the fan is working. Similarly, while you're at it, verify that the case fans are all in good working order and that airflow is not obstructed.

If you have any other RAM that you could temporarily install, I'd suggest doing it to see if that makes a difference (meaning remove all the RAM that's in there now, and try the machine with entirely different RAM). If new RAM fails to make a difference, all your fans are running normally, and your system temps are OK but the machine still fails, then it may be time for a new mobo and/or CPU. Good luck with it -- J.W.
 
Old 11-11-2004, 02:31 AM   #4
Zuggy
Member
 
Registered: Mar 2004
Location: Pocatello, Idaho, USA
Distribution: Ubuntu
Posts: 256

Rep: Reputation: 30
It can't be the CPU fan, because Memtest86 ran for 2 hours. I had a CPU fan fail once and the processor heated up so fast I couldn't get the machine to last more than 30 seconds. Plus, with as hard as he's been trying to find the problem, his system would be dead by now from CPU meltdown. I'd say it might be a broken hard drive, but Live CDs don't work either.

Do you have a PCI IDE card to hook up your CD-ROM, hard drive, etc.? If so, it might have come loose. Also check your IDE cables, because they may have come loose or one of the wires may have gotten severed. You have a fairly old system, so try replacing the battery for your BIOS (little circle battery on the motherboard). I've seen a dead mobo battery do some screwed up stuff.

This is really screwed up. You can try what I suggested, but it doesn't cover the wide range of problems you're having. Something else you might try is opening your system and seeing if there are any burn marks on your hardware. Sometimes some fluke accident will happen where a component will spark and burn out. With the kind of problems you're having, it would probably be on the motherboard.

This is all I can think of right now without knowing more about your system. I would bet money that your problem is a motherboard problem. I say that because it does work at a slower clock speed. I think some component on your motherboard has burned out, but at the slower clock speed that section of the motherboard doesn't operate unless put under severe stress.

Disclaimer: I'm not a certified engineer or licensed computer repair specialist, but I have over 5 years of the TechTV show The Screen Savers under my belt and in my own work have seen some pretty screwed up crap.
 
Old 11-11-2004, 06:44 AM   #5
qwijibow
LQ Guru
 
Registered: Apr 2003
Location: nottingham england
Distribution: Gentoo
Posts: 2,672

Original Poster
Rep: Reputation: 47
About replacing the RAM...
The only other RAM I have is the OLD 100MHz SDRAM.
When I replace my DDR RAM with the OLD SDRAM the machine works perfectly...

However, to use the old SDRAM I have to underclock the motherboard... which is itself known to make the machine work okay (so I can't tell if it is the RAM).

The BIOS reports that my CPU temperature slowly rises over the hours, then levels off at a constant 65 °C. I can't find a normal operating temperature for my CPU (AMD Athlon 1.3GHz), but this seems normal.

Taking all the anti-static precautions (working on an earthed metal surface, wearing an earthing wrist band, and only handling the components by the edges),

I completely took the machine apart. I washed and completely dried the CPU heat sink, and removed the dust from all the fans and other components.

Now my CPU fan runs about 10 rpm faster, and the CPU temperature dropped by 5 degrees... but that improvement could just be due to the room temperature differing.

The crashes have been getting worse... without underclocking the machine, it crashes within a minute on everything now... even Memtest86. Memtest does not report errors, it just freezes (although the cursor sometimes continues to blink).

I'm a student without much money to spend on a new computer, and it is absolutely vital that my machine remains in some kind of working order until the middle of December, when my coursework is due in.

In short...
I cannot take my computer to a repair shop, as that would leave me without a computer to work on (deadlines are getting close).

I cannot afford to buy a new motherboard + new CPU + new RAM, and I'm not 100% sure which component is at fault.

Oh, and to answer the other question... nothing has changed with the machine as far as hardware is concerned.

I use Gentoo, so I'm often compiling new versions of things, but faulty Linux software shouldn't break live distros and other operating systems.

The IDE controller is on-board (SiS chipset).

And I've not bought any new hardware for at least a year (it was a GeForce 4 graphics card; ATI Linux support sucked).

One day, for no reason, my computer crashed... 2 hours later it crashed again, 30 minutes later it crashed again, then 5 minutes... now it won't even boot.

The degradation in performance was extremely severe.

As for the overheating theory, my motherboard has heat sensors, and my kernel is set to warn if anything overheats... the logs show no such warnings.
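
For anyone wanting to repeat that log check, here is a rough Python sketch of scanning log lines for thermal messages. The keyword list is an assumption: real kernel and sensor messages vary by driver and version, so treat `THERMAL_PATTERN` and the sample line as hypothetical.

```python
import re

# Hypothetical keywords; adjust to whatever your kernel/sensor driver logs.
THERMAL_PATTERN = re.compile(r"overheat|thermal|temperature", re.IGNORECASE)


def find_thermal_warnings(lines):
    """Return the log lines that mention any thermal keyword."""
    return [line.rstrip("\n") for line in lines if THERMAL_PATTERN.search(line)]


if __name__ == "__main__":
    sample = [
        "kernel: eth0: link up\n",
        "kernel: CPU temperature above threshold\n",  # made-up example line
    ]
    for hit in find_thermal_warnings(sample):
        print(hit)
```

Bear in mind that a hard freeze may never get the chance to write a warning to disk, so an empty result is only weak evidence against overheating.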

Thanks for the help, but from other people I've talked to, it seems that without fully working, tested components to swap in in a trial-and-error fashion, it will be very difficult to locate the fault.

Edit: oh yeah, and the BIOS battery is fine, as are all the BIOS settings.
There is a BIOS setting for memory timing (Safe, Default, Fast, Ultra).
I've set that to Safe but it makes no difference.

I even tried disabling my L1 and L2 cache just because I had run out of ideas. All that did was slow the CPU down, with no increase in reliability.

Last edited by qwijibow; 11-11-2004 at 06:47 AM.
 
Old 11-11-2004, 11:50 AM   #6
Zuggy
Member
 
Registered: Mar 2004
Location: Pocatello, Idaho, USA
Distribution: Ubuntu
Posts: 256

Rep: Reputation: 30
Well, at least I was right about it not being the fan. My system runs at 65 degrees C for days at a time. The fact that it still sucks with the PC100 RAM means it's not the RAM. Have you checked the net for similar problems with SiS-based motherboards? Also, when it crashes, what does it do specifically? Does it freeze up, shut down, etc.? And when it crashes, do you get any beeps? If you get any beeps from your computer when it crashes, write down how many there are and whether they're long or short. If it beeps, your computer is telling you what's wrong. All you have to do is look up the beep code on the internet.

Those are the only other things I can think of. I am honestly perplexed.
 
Old 11-12-2004, 01:19 AM   #7
qwijibow
LQ Guru
 
Registered: Apr 2003
Location: nottingham england
Distribution: Gentoo
Posts: 2,672

Original Poster
Rep: Reputation: 47
The machine doesn't suck when I plug in the PC100 RAM... but the thing is, to use the PC100 RAM I have to underclock the machine... which I know helps the problem... so I don't know whether it's changing the RAM that's helping, or just the fact that I'm underclocking.
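
The confound described here can be laid out as a tiny table. This Python sketch encodes the three configurations reported in this thread under normal use (the underclocked DDR setup is counted as stable, ignoring the heavy-load case) and asks whether each factor, taken alone, always gives the same outcome. The helper `explains_crashes` is a made-up name for illustration.

```python
# Observations as reported in this thread (normal use, not heavy load).
# Each entry: (ram_type, clock, crashed)
observations = [
    ("DDR",   "1.3GHz", True),
    ("DDR",   "1GHz",   False),
    ("PC100", "1GHz",   False),
]


def explains_crashes(index):
    """True if every value of the factor at `index` always gives the same outcome."""
    outcomes = {}
    for obs in observations:
        outcomes.setdefault(obs[index], set()).add(obs[2])
    return all(len(seen) == 1 for seen in outcomes.values())


if __name__ == "__main__":
    print("RAM type alone explains crashes:", explains_crashes(0))
    print("Clock speed alone explains crashes:", explains_crashes(1))
```

With these three data points, clock speed lines up with the crashes while RAM type does not, since the same DDR is stable when underclocked. Because the PC100 RAM can only run at the slow clock, swapping it in adds no new information about the DDR itself.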

Good idea about the PC beeps... I'll go plug my internal speaker back in!

The problem with the crash is that it just halts: no mouse movement, no screen changes, no reaction to any control key sequences. Nothing.

And googling for SiS chipset PCs freezing up returns thousands of results... all of them Windows software crashes.

Thanks for the help everyone, but it seems I'm going to have to replace each component till I find the bugger.
 
Old 11-12-2004, 02:26 AM   #8
qwijibow
LQ Guru
 
Registered: Apr 2003
Location: nottingham england
Distribution: Gentoo
Posts: 2,672

Original Poster
Rep: Reputation: 47
AND... the speaker is silent... no beeps
 
  

