This morning I installed Slackware 13 i386 and started configuring it. I've got a problem as my laptop has turned off unexpectedly twice since I started using Slack 13. I don't know if it's related to Slackware 13 but it has never happened before on this laptop with Slackware 12.2 or any other system.
The first guess would be overheating. I checked the laptop's BIOS but could not find anything related to cooling down. The laptop seems quite hot but it's nothing unusual for it. I also noticed that acpi is not installed by default. Would it make much difference?
The laptop is pretty new (6 months) and the cooling module seems to be clean. The only thing that differs from my previous installations is that for the first time I installed ATI's fglrx and enabled desktop effects. Could it be a buggy video driver?
See if there are any interesting messages in /var/log/messages
If it's a software controlled shutdown, it should say why.
(Eg overheating -> Shutting down NOW)
Improve the cooling - does this stop the shutdowns?
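A rough sketch of the log check suggested above. The keyword list is a guess at what a thermal or power-related shutdown message might contain, not an exhaustive or authoritative set, and `scan_log` is just a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: search a syslog-style file for lines hinting at a thermal or
# power-related shutdown. The keywords are assumptions; adjust as needed.
scan_log() {
    # $1: log file to search (defaults to /var/log/messages)
    log="${1:-/var/log/messages}"
    grep -i -E 'thermal|temperature|overheat|critical|shutting down|power down' "$log"
}

# Only scan automatically if the default log exists and is readable
if [ -r /var/log/messages ]; then
    scan_log /var/log/messages | tail -n 20
fi
```

If nothing relevant shows up just before the gap in the timestamps, the power-off was probably not initiated by software.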
It seems to be fine now. I left the laptop on overnight and so far so good. It didn't look like a software controlled shutdown as it just switched off as if you had removed all sources of power at once.
I didn't find anything suspicious in /var/log/messages. Unfortunately the logging system in BIOS is disabled.
I'll just wait and see if it happens again. I was thinking of logging the temperature reading every 10 seconds, but I don't seem to be able to read the temperature at all: /proc/acpi/thermal_zone is empty (even though I installed acpi). After some googling I'm still none the wiser:
lm_sensors is installed:
Code:
ls /var/log/packages/ | grep lm_sensors
lm_sensors-3.1.1-i486-1
Apparently, no sensors have been detected on my laptop:
Code:
# sensors
No sensors found!
Make sure you loaded all the kernel drivers you need.
Try sensors-detect to find out which these are.
After going through all the stages, sensors-detect also concluded that I've got no sensors whatsoever.
The first I knew of it was when I installed a new distro about 3-4y ago, and it kept printing "Over temperature PANIC" all over my terminal, when, in fact everything was OK. Something wasn't calibrated properly and I couldn't be a**ed to fix it.
I don't overclock, my MoBos are dusted sometimes, the PCs continue to work, so I have always disabled it. I hate being nagged. One of the reasons I quit win.
Bottom line: If there's nothing about forcing a shutdown in messages then it's probably a hardware fault. Try running memtest on a hot day, or put a blanket over it.
Last thought: It's a laptop: It has batteries / needs power. Was everything plugged in and switched on?
A couple of yrs ago I was halfway through a presentation when my linux laptop powered off.
Damn! I had plugged in the PSU (good), but not switched it on at the socket (bad).
To my great surprise, when power was restored, the presentation was where I left it (or it had left me).
Yeah, the laptop was connected properly and running on AC power. Right after it switched off I booted it and checked the battery indicator - it was 100% charged.
It hasn't happened again though so I'm getting less and less concerned. I'll do the memtest though.
Just reviewing the posts.
What make and model laptop are you using ?
It's curious that you are experiencing system shutdowns (which I take to be power-off events) but have not experienced any system hangs (which would be indicative of a hardware or overheating problem).
On really old laptops I have seen problems with incorrect hardware responses to ACPI activities (powering off instead of entering standby, and vice versa), but not of late.
Is it possible that the external power supply failed ? (External PSU has an intermittent fault, external PSU flyleads are faulty, AC wall outlet is faulty, etc ?) The reason I ask is it sounds suspiciously like the laptop wound up running on batteries and ran out of power.
May I suggest that you thoroughly examine the external power supply and associated cables, and try using a different AC wall outlet. Also test the wall outlet with an AC test plug (in Australia these can be purchased from a local electrical products supplier - they indicate whether the wiring is correct).
Just a thought. I have encountered many laptops with faulty external PSU connectors, stretched leads, etc over the years. Often the simplest explanation is the cause.
Thanks for your suggestions.
I might be talking complete rubbish now but I'm inclined to think that it's something to do with acpi.
After it shut down twice on me on Saturday I installed acpi (not installed by default). The laptop was constantly on till last night (4 days). Nothing happened. I also successfully completed memtest86 (with a towel covering the laptop) with no errors. Everything was brilliant.
Last night I booted into another partition with a fresh install of Slackware (without acpi installed by default) to play with fluxbox. Guess what? It shut down after 1 hour. As I said I don't know if it's possible that acpi (or some other power management program) can be responsible for it, but that's what my inexperienced mind would think.
The kernel will only find sensors it can load the modules for.
Here's what acpitool -t looks like on my box
bash-3.1$ acpitool -t
Thermal zone 1 : ok, 50 C
Trip points :
-------------
critical (S5): 105 C
passive: 95 C: tc1=1 tc2=2 tsp=100 devices=C000 C001
active[0]: 75 C: devices=C39B
active[1]: 65 C: devices=C39C
active[2]: 55 C: devices=C39D
active[3]: 40 C: devices=C39E
This is actually 4 speeds of the same fan. Here's the modules I load for the sensors (which I added to /etc/rc.d/rc.lm_sensors.conf)
From post #1, you have Slackware i386, a 32-bit OS, loaded. Come back to what I posted earlier:
MODULE_0=i2c-piix4
MODULE_1=lm90
MODULE_2=k8temp
MODULE_4=adm1021
I very much doubt you have the k8temp module, which would be fairly critical for you. Nor would you have any of the 64-bit processor extended features. My box is similar, and ACPI doesn't work on it with that kernel. I suggest a 64-bit OS.
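Rather than guessing, the presence of those modules can be checked directly. This is only a sketch: it assumes the modules are stored as `<name>.ko` files (possibly compressed) under /lib/modules for the running kernel, and `have_module` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: report which of the modules listed above exist in a kernel
# module tree. Assumption: modules are files named <name>.ko* under the
# directory given as $2, by default /lib/modules/$(uname -r).
have_module() {
    # $1: module name, $2: module tree to search (optional)
    dir="${2:-/lib/modules/$(uname -r)}"
    [ -n "$(find "$dir" -name "$1.ko*" 2>/dev/null | head -n 1)" ]
}

for m in i2c-piix4 lm90 k8temp adm1021; do
    if have_module "$m"; then
        echo "$m: available"
    else
        echo "$m: not found"
    fi
done
```

A "not found" here only means no matching file was located; a driver built into the kernel itself would not show up this way.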
Have you considered removing Slackware from the equation by booting the device with DOS and leaving the machine to see if it resets while it is sitting idle ? (ie. indicating that it is some kind of hardware error).
Just get the machine to boot and display the date and time and let it stand idle to see if it reboots unexpectedly.
Or run a live CD distro and see if the same fault occurs.
I created /etc/rc.d/rc.lm_sensors.conf and added the modules. As a result, sensors-detect actually gave me a 'yes':
Code:
Lastly, we can probe the I2C/SMBus adapters for connected hardware
monitoring devices. This is the most risky part, and while it works
reasonably well on most systems, it has been reported to cause trouble
on some systems.
Do you want to probe the I2C/SMBus adapters now? (YES/no): yes
Using driver `i2c-piix4' for device 0000:00:14.0: ATI Technologies Inc SB600 SMBus
Next adapter: SMBus PIIX4 adapter at 0b00 (i2c-0)
Do you want to scan it? (YES/no/selectively): yes
Client found at address 0x50
Probing for `Analog Devices ADM1033'... No
Probing for `Analog Devices ADM1034'... No
Probing for `SPD EEPROM'... Yes
(confidence 8, not a hardware monitoring chip)
Probing for `EDID EEPROM'... No
Client found at address 0x51
Probing for `Analog Devices ADM1033'... No
Probing for `Analog Devices ADM1034'... No
Probing for `SPD EEPROM'... Yes
(confidence 8, not a hardware monitoring chip)
Sorry, no sensors were detected.
This is relatively common on laptops, where thermal management is
handled by ACPI rather than the OS.
Still, /proc/acpi/thermal_zone is empty
Quote:
Have you considered removing Slackware from the equation by booting the device with DOS and leaving the machine to see if it resets while it is sitting idle ? (ie. indicating that it is some kind of hardware error).
Just get the machine to boot and display the date and time and let it stand idle to see if it reboots unexpectedly.
Or run a live CD distro and see if the same fault occurs.
It's been running ok for a few days now.
I'm going to try 64 bit as well.
Okay, you have an sb600 southbridge, same as me.
Get a 64 bit OS. I have Slamd64 and Fedora installed. Either works. I had a bad time with ubuntu, which basically didn't work.
It had so many issues out of the box I just ran away.
As for /proc/acpi/thermal_zone, there's a kernel option for creating deprecated (= obsolete) /proc files for old software. That stuff should now be in /sys:
cd /sys && find -name '*thermal*'
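If that search turns up thermal zones, the temperature logging idea from earlier in the thread could be sketched roughly as follows. It assumes the zones appear as /sys/class/thermal/thermal_zone*/temp and report millidegrees Celsius, which may not hold on every machine; `log_temps` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print one timestamped line per thermal zone found under sysfs.
# Assumption: zones appear as <base>/thermal_zone*/temp and hold the
# temperature in millidegrees Celsius.
log_temps() {
    # $1: sysfs thermal directory (defaults to /sys/class/thermal)
    base="${1:-/sys/class/thermal}"
    for f in "$base"/thermal_zone*/temp; do
        [ -r "$f" ] || continue                   # no zones, or unreadable
        milli=$(cat "$f" 2>/dev/null) || continue # skip zones that fail to read
        zone=$(basename "$(dirname "$f")")
        echo "$(date '+%Y-%m-%d %H:%M:%S') $zone $((milli / 1000)) C"
    done
}

log_temps
```

To log every 10 seconds, the function could be called in a loop such as `while true; do log_temps >> temps.log; sleep 10; done` (temps.log is an arbitrary file name). If the machine powers off again, the last lines of the log show how hot it was just beforehand.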
Thanks business_kid.
I was going to try slackware64 anyway. I installed it last night, did some configuration for a couple of hours and left it on. Before I went to work I checked it and it was fine. I'll see if it's still ok when I get back from work.