LinuxQuestions.org


ticktockhouse 05-12-2012 05:27 AM

Previously perfectly good machine freezing/rebooting randomly after reinstall
 
Hello all,

I just thought I'd share a problem that I've been having with a system I have at home. Believe me, I've searched many a forum for people having similar problems, but I wondered if anyone here might have any more suggestions.

I'm a seasoned Linux user. I've had various Linux boxes acting as a home server for around 10 years now, and I've been working full-time with Linux professionally for around 8 years, yet I've not really seen anything like this.

First some background:

I had a Fedora 14 install on the box in question which was working with no problems at all. I decided to update to something more...well, up-to-date, and decided on Ubuntu 12.04, as they'd just brought out this shiny new LTS release. The install seemed to go fine, until I booted and started to use it.

It just started randomly rebooting. I then tried Debian and Fedora 16, and these seemed to randomly freeze up when in X rather than rebooting. Since I was rapidly losing the will to live while this was happening (I originally thought it would take a couple of hours at most), I didn't keep a particularly good record of what crashed and when, though it does seem to be X-related (F16 froze up during the install). The only distro that seems stable at the moment is Gentoo, which I reluctantly installed as I don't know it all that well (I used it many years ago) and find it a bit fiddly for day-to-day usage, since I spend all day fiddling with Linux boxes...

My hardware is as follows:

- ALiveNF6G-DVI motherboard
- nForce chipset (sata_nv kernel module)
- GeForce 6150 gfx card (nouveau?)
- Onboard ethernet (forcedeth module)
- Athlon 64 X2 BE-2400 CPU
- 500 GB SATA HDD
- IDE DVD-ROM (had problems with SATA one)

NB, I substituted the onboard Geforce card with a Quadro FX 3450/4000 SDI in case that was what was causing the problem.

I also ran memtest86 overnight with no errors whatsoever.

I also tried installing the NVIDIA proprietary driver once I managed to get Ubuntu installed for long enough. I've never had enough time before the inevitable freeze-up to run any more tests/install packages etc.

This feels to me like a problem with the nouveau driver (which doesn't show up with lsmod on Gentoo) and/or the 3.x kernel, as the system was perfectly fine when it still had F14 on it, and only started having problems with newer distros (I didn't try putting F14 back on it, since there are only so many hours in the day). Every time it's the same: I get an OS on there and come back after a few hours to find it frozen, which is not much use for a home server/family workstation machine.
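
For anyone wanting to compare installs, one way to confirm whether nouveau is actually loaded is to read /proc/modules, which is the file lsmod itself formats. The following is only a rough sketch: the helper names `loaded_modules`, `is_module_loaded` and `read_proc_modules` are made up for this example, and it assumes a Linux system with /proc mounted.

```python
from pathlib import Path


def loaded_modules(proc_modules_text: str) -> set:
    """Return module names from text in /proc/modules format.

    Each non-empty line starts with the module name, followed by
    size, use count and other space-separated fields.
    """
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def is_module_loaded(name: str, proc_modules_text: str) -> bool:
    """True if `name` is listed as a loaded module.

    /proc/modules uses underscores, so dashes are normalised first.
    """
    return name.replace("-", "_") in loaded_modules(proc_modules_text)


def read_proc_modules(path: str = "/proc/modules") -> str:
    """Read the live module list (assumes Linux with /proc mounted)."""
    return Path(path).read_text()
```

Calling `is_module_loaded("nouveau", read_proc_modules())` on each install would show which ones use the driver as a module. A driver compiled directly into the kernel does not appear in this list, so its absence from lsmod does not by itself prove the driver is unused.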

If anyone has had similar experiences, I'd be interested to know how you got on.

Thanks for taking a look at my rather frustrating problem.

onebuck 05-12-2012 06:54 AM

Member response
 
Hi,

Quote:

memtest86+ <- 'memory tester which is based on memtest86 v3.0, and provides an up-to-date version of this useful tool, which aims to be as reliable as the original. It has been fixed to work on AMD64 systems, and also properly detects all current CPUs and motherboard chipsets. The project supports ECC polling for AMD64, i875P, and E7205, and displays some useful settings for the most popular chipsets'
I prefer 'memtest86+' on newer equipment.

Any logged information in '/var/log' or 'dmesg'?
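
A minimal sketch of how the logs could be skimmed after a freeze is given here. The keyword list is an assumption (a few common kernel trouble markers plus the suspected driver name), not an exhaustive set, and the function names are invented for this example.

```python
import re

# Assumed, non-exhaustive markers of kernel trouble; adjust to taste.
SUSPICIOUS = re.compile(
    r"kernel panic|oops|call trace|soft lockup|segfault|nouveau",
    re.IGNORECASE,
)


def suspicious_lines(log_lines):
    """Yield (line_number, text) for every line matching SUSPICIOUS."""
    for number, line in enumerate(log_lines, start=1):
        if SUSPICIOUS.search(line):
            yield number, line.rstrip("\n")


def scan_log_file(path):
    """Scan one log file; errors='replace' tolerates stray binary bytes."""
    with open(path, errors="replace") as handle:
        return list(suspicious_lines(handle))
```

The file to pass to `scan_log_file` depends on the distro; /var/log/syslog and /var/log/messages are typical candidates, but that is an assumption about the install. Keep in mind that the last messages before a hard freeze may never reach the disk, so an empty result does not rule out a kernel problem.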
Why not try a LiveCD to see how things work;

Quote:

KNOPPIX <- LiveCD is a good choice to boot the system. You have several boot options & kernels available.

Or use diagnostics from;

UBCD Ultimate Boot CD <- 'UBCD allows users to run floppy-based diagnostic tools from most CDROM drives on Intel-compatible machines, no operating system required. The cd includes many diagnostic utilities.'

OR


SystemRescueCd <- 'is a Linux system on a bootable CD-ROM for repairing your system and recovering your data after a crash. It aims to provide an easy way to carry out admin tasks on your computer, such as creating and editing the partitions of the hard disk. It contains a lot of system utilities (parted, partimage, fstools, ...) and basic tools (editors, midnight commander, network tools).' + 'Online-Manual'
You could use system diagnostics to confirm that everything is working. Be sure to check the PSU, as this can be the issue. What about the fans and cooling? Are the heat sink and CPU fan OK? Is it a thermal pad or compound? Sometimes the heat sink compound dries out and needs replacing.

What about trying to 'ssh' into the machine before it locks up and again while it is locked?
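
The idea behind the ssh test can be sketched from a second machine on the network. If the frozen box still accepts a TCP connection on the ssh port, the kernel is probably still alive and the trouble is more likely in X or the graphics driver; if it does not answer at all, the whole machine has likely hung. This is only an illustrative sketch: the function names are invented, and port 22 is assumed to be where sshd listens.

```python
import socket
import time


def port_reachable(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


def watch(host, port=22, interval=30.0, checks=10):
    """Probe repeatedly and return a list of (unix_time, reachable) samples."""
    samples = []
    for i in range(checks):
        samples.append((time.time(), port_reachable(host, port)))
        if i < checks - 1:
            time.sleep(interval)
    return samples
```

Running `watch` against the server's address and noting when the samples flip to False gives a rough time of the hang. A successful connection only shows that the network stack is answering, so an actual ssh login is still the better test of whether userspace is usable.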

jefro 05-12-2012 10:16 PM

At first glance I'd say you don't have enough memory available to the OS.

Reload the older OS and see if it is stable again.

ticktockhouse 05-28-2012 04:07 AM

Hello,

Thanks for the replies. I don't have any spare PSUs or money to buy any (or any other spare hardware, for that matter), so I couldn't really test that. I forgot to mention above that I have 4 GB of RAM, which I think should be enough these days.

I haven't really got anything to add other than to say Gentoo has been perfectly stable for weeks now. I'm only now getting to the stage of configuring X and GUI-type things, so the root cause remains a mystery, though it feels driver- or at least software-related to me. As I said above, everything was cool before I foolishly decided to change it, and Gentoo is rock-solid. So I put it down to something that's turned on by default in newer OSes that my particular hardware combination doesn't agree with.

Besides, Gentoo isn't so bad once you get used to it, and it does tend to teach you a fair amount about how Linux actually works.

I'd still be interested to know if anyone has similar problems with this hardware.

Thanks again for the replies..


Jerry

jefro 05-29-2012 11:47 AM

This is not really an uncommon issue: Fedora 14 runs fine for years and then Fedora 16 fails.

You have to understand that a computer is an almost unusable product when sold. It is barely on the edge of working. I used to work at a large computer maker where we would have to warranty thousands of boards when the customer updated the OS. There are just too many parts on a board that could go wrong on a different OS.

To follow up on your test, you'd have to reload the old working OS to the exact state it was in when it worked and see if it is stable. If so, then we can assume that some issue has crept in with the newer distros. It could be as simple as a BIOS setting needed to correct or change some motherboard chips or timers.

Fedora is kind of the test distro and Gentoo kind of the stable one, so any number of kernel settings would be different.

ticktockhouse 05-30-2012 01:28 PM

Hey Jefro,

I appreciate that input. Actually, I'm pretty happy with Gentoo. I last looked at it years ago, and it actually taught me most of what I still know about the core Linux stuff. All the other stuff is the intricacies of certain distributions, like their package managers, etc. Portage works pretty well; it just takes time to compile everything from source. For the main packages at least, there's virtually no intervention needed in the complex source-compiling process other than "emerge blah".

Just followed the really-easy guide to installing XFCE, to actually get a GUI on there and X works fine, XFCE looks good, and the graphics parts of the apps I installed like Virtualbox are already there in the menus!

Sweet.

PS didn't do anything special with the kernel, just ran "genkernel" to build myself one (I must admit to a *little* tinkering there to get it working!).

Anyway, going to carry on playing with it now...


All times are GMT -5.