SuSE 10.0 Crashing Repeatedly
I have a LinuxCertified laptop (AMD Athlon 64 Mobile) running SuSE 10.0 (x86_64). For a while, it's been crashing unexpectedly. At first it would overheat and then just shut off, so I figured the overheating was the cause. Once I vacuumed it out, it seemed fine. The crashes happened a few times in a row, every few days, for a week or so, then let up for about 2 months. The laptop is on 24x7.
It just started happening again today. I was browsing the web, then the mouse froze, hung like that for a second or so, and then the laptop did a hard power down. I opened it up and vacuumed, and also noticed a pretty large amount of heat, but it seemed to be radiating from the hard drive. I can't trace any software problems in the logs. Any thoughts? Has anyone heard of any hardware problems that could cause this? How about software?

Some pertinent log entries.

The last entries in /var/log/warn before the crash:
Dec 19 19:43:57 antmanLaptop smbd[6346]: [2006/12/19 19:43:57, 0] printing/print_cups.c:cups_cache_reload(85)
Dec 19 19:43:57 antmanLaptop smbd[6346]: Unable to connect to CUPS server localhost - Connection refused
Dec 19 19:44:02 antmanLaptop kernel: ACPI-0071: *** Warning: Invalid 'package' argument
Dec 19 19:44:02 antmanLaptop kernel: ACPI-0285: *** Warning: Invalid _PSS data
Dec 19 19:44:02 antmanLaptop kernel: cpu_init done, current fid 0x0, vid 0x18
Dec 19 19:44:02 antmanLaptop SuSEfirewall2: Warning: ip6tables does not support state matching. Extended IPv6 support disabled.
Dec 19 19:44:07 antmanLaptop udevd[2288]: get_netlink_msg: no ACTION in payload found, skip event 'mount'
Dec 19 19:44:10 antmanLaptop udevd[2288]: get_netlink_msg: no ACTION in payload found, skip event 'umount'
Dec 19 19:45:00 antmanLaptop hp: unable to open /var/run/hpiod.port: No such file or directory: prnt/hpijs/hplip_api.c 75
<END OF LOG BEFORE CRASH>

In the last kdm.log before the crash:
FATAL: Module fglrx not found.
[drm] failed to load kernel module "fglrx"
FATAL: Module fglrx not found.
[drm] failed to load kernel module "fglrx"
(EE) fglrx(0): DRIScreenInit failed!
<END OF LOG BEFORE CRASH>

Any suggestions for other places to look? Other logs to post? How do I start to narrow down a hardware problem? The manufacturer only has a 1 year warranty, and they want something like $200 just to look at it... and if I have to spend that much without even knowing what the repair bill will be, I'd rather just save up for a brand name laptop.
Thanks for any help,
Jason |
You can use the smartctl command to check the temperature and health of the hard drive. If the drive temperature is consistently high, that often points to impending failure (such as a worn drive bearing). For example:
Code:
# smartctl -iAH /dev/hda |
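If you only want the temperature reading out of that output, something like the following might help. It is a rough sketch, not a definitive recipe: it assumes the drive reports SMART attribute 194 (Temperature_Celsius) and that the raw value sits in the 10th column of the `smartctl -A` table, which is the usual layout but not guaranteed for every drive (some report attribute 190 instead). The function name `drive_temp` is made up for this example.

```shell
#!/bin/sh
# Sketch: pull the drive temperature out of `smartctl -A` output.
# Assumptions (not from the original post):
#   - the drive exposes SMART attribute 194 (Temperature_Celsius)
#   - the raw value is the 10th whitespace-separated column
# drive_temp is a hypothetical helper name.

drive_temp() {
    # Reads `smartctl -A` output on stdin and prints the raw
    # temperature from the first line whose ID column is 194.
    awk '$1 == 194 { print $10; exit }'
}

# Usage (as root; /dev/hda is the device used in the post above):
#   smartctl -A /dev/hda | drive_temp
```

If this prints nothing, the drive probably does not report attribute 194, so look through the full `smartctl -A` table by hand instead.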
I don't know what it means, but the smartctl command is not on my system.
Is it a part of SuSE 10.0 by default? |
You may need to install it separately. A quick Google search yielded these packages for SuSE 10.0:
smart-addons smart smart-gui smartmontools |
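Of the packages listed, smartmontools is the one that ships smartctl. Before installing anything, a quick check like the one below can tell you whether the command is already there. This is only a sketch: `have_cmd` is a made-up helper name, and the note about /usr/sbin is an assumption about where the binary usually lives, so a normal user's PATH may miss it even when the package is installed.

```shell
#!/bin/sh
# Sketch: check whether smartctl is available before installing anything.
# have_cmd is a hypothetical helper name, not part of any package.

have_cmd() {
    # Succeeds if the named command can be found in the current PATH.
    command -v "$1" >/dev/null 2>&1
}

if have_cmd smartctl; then
    echo "smartctl found at $(command -v smartctl)"
else
    # Assumption: smartctl is typically installed under /usr/sbin,
    # which is often only in root's PATH, so try again as root
    # before concluding that the package is missing.
    echo "smartctl not found in PATH; try as root, or install smartmontools"
fi
```

On an RPM-based system such as SuSE, `rpm -q smartmontools` should also report whether the package itself is installed.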