Linux - Hardware This forum is for Hardware issues.
|
 |
|
11-03-2020, 08:35 AM
|
#16
|
LQ Newbie
Registered: Oct 2020
Location: Ukraine
Distribution: Debian
Posts: 13
Original Poster
Rep:
|
I am running a smartctl test right now on the original drive. The shutdowns still happened after e2fsck -pcfv, and also with laptop-mode-tools set to maximum CPU throttling, Ethernet throttling enabled, and aggressive HDD power management enabled.
|
|
|
11-03-2020, 10:50 AM
|
#17
|
LQ Newbie
Registered: Oct 2020
Location: Ukraine
Distribution: Debian
Posts: 13
Original Poster
Rep:
|
Here is the smartctl output from the original drive. The temperature went up to 49C. With the laptop elevated it dropped to 38C, the same as with the bad drive under the same strict laptop-mode-tools settings. I am surprised it didn't shut down. It used to reboot within minutes with gsmartctl running.
+++ It does show some errors, but they weren't present when I first ran the program. I fear that the errors appeared after I started swapping HDDs to track down the issue. I might have handled them wrongly.
https://www.linuxquestions.org/quest...1&d=1604418554
Last edited by vvbond; 11-03-2020 at 10:56 AM.
|
|
|
11-03-2020, 11:21 AM
|
#18
|
Member
Registered: Jan 2014
Location: Canton, Georgia, USA
Distribution: Debian 12
Posts: 205
Rep: 
|
I've never seen a hard drive shut down that easily. The Seagate ST9500325AS specs list the operating temperature range for that model as 0 to 60 degrees Celsius.
That being said, I also have a Seagate 9500320 in an old Asus lappy and it started piling up read errors at 4.4K POH with random crashes.
Perhaps Seagate is simply a bad manufacturer if you want lasting equipment.
|
|
|
11-03-2020, 11:31 AM
|
#19
|
LQ Newbie
Registered: Oct 2020
Location: Ukraine
Distribution: Debian
Posts: 13
Original Poster
Rep:
|
It started suddenly around three months ago. Before that I ran Debian on the original Acer 250GB drive for a little less than two years and Arch Linux on the bad Asus 500GB drive for four years. Before that they were second-hand Windows installations.
And now it can't work for more than 15 minutes. Maybe it can work for an hour if I only browse text and compile a small repo without installing any dev tools. That isn't reasonably usable.
I am not entirely sure it's as simple as buying a new hard drive either, even if I could afford it.
|
|
|
11-03-2020, 12:03 PM
|
#20
|
Senior Member
Registered: Aug 2016
Posts: 3,345
|
That WD drive reports a recommended max temp of 60 and a critical max of 85. It also reports a lifetime max temp of 72, so I do not suspect the drive is failing the way the Seagate drive clearly was with its errors. During the self test it reached a stable 49, which I would watch but not be concerned about.
I would watch the WD drive, periodically check status, and observe it cautiously for possible progressive failure.
|
|
|
11-03-2020, 12:14 PM
|
#21
|
Senior Member
Registered: Aug 2016
Posts: 3,345
|
Quote:
Originally Posted by vvbond
I am running a smartctl test right now on the original drive. The shutdowns still happened after e2fsck -pcfv, and also with laptop-mode-tools set to maximum CPU throttling, Ethernet throttling enabled, and aggressive HDD power management enabled.
|
I think aggressive power management on HDDs is overrated. The benefit of the power savings is questionable when weighed against the reduction in drive lifetime caused by repeated on/off cycles. I have never been one to place the cost of the trickle of power used above the cost of replacing the drive should it fail early. Almost all mechanical devices last for more operating hours when run continuously than under repeated start/stop cycles. If you are truly worried about the amount of power used, then an SSD would be of great benefit. The same applies to Ethernet throttling, though my concern there is the delay in reconnecting after it has been shut down to "save power".
However, the CPU throttling is certainly of benefit if there are temperature or power concerns, since the CPU and GPU are the greatest power hogs on most systems.
Last edited by computersavvy; 11-03-2020 at 12:18 PM.
|
|
1 members found this post helpful.
|
11-03-2020, 01:09 PM
|
#22
|
Senior Member
Registered: Dec 2010
Location: California, USA
Distribution: I run my own OS
Posts: 1,055
|
Quote:
Originally Posted by computersavvy
I think aggressive power management on HDDs is overrated. The benefit of the power savings is questionable when weighed against the reduction in drive lifetime caused by repeated on/off cycles. I have never been one to place the cost of the trickle of power used above the cost of replacing the drive should it fail early. Almost all mechanical devices last for more operating hours when run continuously than under repeated start/stop cycles. If you are truly worried about the amount of power used, then an SSD would be of great benefit. The same applies to Ethernet throttling, though my concern there is the delay in reconnecting after it has been shut down to "save power".
|
I own a pair of the notoriously unreliable Seagate 3TB drives. Early on, I noticed that the power management would park the heads after a period of inactivity. This made an audible noise. I disabled the power management because of the noise, but I figured that doing so also avoided mechanical wear on the parking mechanism. These drives still work after six years.
The OP's drive has over 1M head load/unload cycles. The Seagate desktop drives are rated at 300K cycles.
Ed
Last edited by EdGr; 11-03-2020 at 02:29 PM.
Reason: looked up spec
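The comparison above (a load/unload count against the drive's rating) can be checked from the attribute table that `smartctl -A` prints. The following is a minimal sketch, assuming the usual ten-column attribute layout and that the drive reports an attribute named `Load_Cycle_Count`; the 300K figure is the rating quoted in this post, not a value read from the drive, and the helper names are made up for illustration.

```python
import subprocess

# Rating quoted in the post above; an assumption for any other drive model.
RATED_LOAD_CYCLES = 300_000


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} from `smartctl -A` table output."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the raw value is the 10th column (extra annotations may follow it).
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attrs[fields[1]] = int(fields[9])
    return attrs


def load_cycle_ratio(attrs, rated=RATED_LOAD_CYCLES):
    """Fraction of the rated load/unload cycles used, or None if not reported."""
    count = attrs.get("Load_Cycle_Count")
    return None if count is None else count / rated


def read_attributes(device):
    """Run smartctl on a device (usually needs root) and parse its attributes."""
    result = subprocess.run(["smartctl", "-A", device],
                            capture_output=True, text=True)
    return parse_smart_attributes(result.stdout)
```

Calling `load_cycle_ratio(read_attributes("/dev/sda"))` would give the ratio for the first disk, where `/dev/sda` is only a guess at the device name. A result above 1.0 means the drive has passed its rated cycle count.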
|
|
|
11-03-2020, 03:57 PM
|
#23
|
LQ Newbie
Registered: Oct 2020
Location: Ukraine
Distribution: Debian
Posts: 13
Original Poster
Rep:
|
> I think aggressive power management on HDD is overrated. The benefits of power savings vs drive lifetime reduction caused by repeated on / off cycles is questionable.
I am aware. I am desperate to make it work at all. I can't really watch it if I can't use it for any work. Any torrenting, moving large files, frequent compilation or video playback, on either of those two drives, just triggers a shutdown.
I think it is getting worse, because the smartctl daemon sent four error messages to /var/mail/myuser over the past month. I figured that whatever those are, the latest report covers them.
The only possibility I can think of that I haven't tried yet is to re-apply thermal paste to the CPU, maybe. Can you think of any more possible solutions to random shutdowns?
|
|
|
11-03-2020, 10:46 PM
|
#24
|
Senior Member
Registered: Aug 2016
Posts: 3,345
|
Quote:
Originally Posted by vvbond
I am aware. I am desperate to make it work at all. I can't really watch it if I can't use it for any work. Any torrenting, moving large files, frequent compilation or video playback, on either of those two drives, just triggers a shutdown.
I think it is getting worse, because the smartctl daemon sent four error messages to /var/mail/myuser over the past month. I figured that whatever those are, the latest report covers them.
The only possibility I can think of that I haven't tried yet is to re-apply thermal paste to the CPU, maybe. Can you think of any more possible solutions to random shutdowns?
|
Everything you mentioned, in fact everything you have mentioned from the beginning, demands CPU action. I had a PC that would intermittently shut down unexpectedly like yours, and even though I monitored it closely with gkrellm I was never able to see the temp spike on the CPU that triggered it.
My system was water cooled and had been running flawlessly for over a year at about 98% CPU load with temps only about 50C.
What I finally figured out was that the pump on my cooling system had quit. The temp was spiking so rapidly that gkrellm did not even have time to report the spike before it shut down. This is why I asked you about the fans in your machine.
I don't know if new thermal paste will be enough, but it certainly won't hurt and could give you a lot more stable life. Replacing fans that are a decade old would not hurt either.
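One way to catch a spike that a desktop monitor misses is to sample the kernel's thermal zones several times per second and force every reading to disk, so the last lines survive the power-off. This is only a sketch under assumptions: it relies on the machine exposing `/sys/class/thermal/thermal_zone*/temp` (values in millidegrees Celsius), and the function names and the 0.2 second interval are illustrative choices, not anything used in this thread.

```python
import glob
import os
import time


def read_zone_temps(base="/sys/class/thermal"):
    """Return {zone_name: degrees_C} for every readable thermal zone under base."""
    temps = {}
    for path in sorted(glob.glob(os.path.join(base, "thermal_zone*", "temp"))):
        zone = os.path.basename(os.path.dirname(path))
        try:
            with open(path) as f:
                temps[zone] = int(f.read().strip()) / 1000.0  # millidegrees -> C
        except (OSError, ValueError):
            continue  # some zones refuse reads; skip them
    return temps


def log_temps(logfile, interval=0.2, samples=None, base="/sys/class/thermal"):
    """Append timestamped readings, syncing each line so it survives a power cut."""
    taken = 0
    with open(logfile, "a") as log:
        while samples is None or taken < samples:
            temps = read_zone_temps(base)
            readings = " ".join(f"{z}={t:.1f}" for z, t in temps.items())
            log.write(f"{time.time():.3f} {readings}\n")
            log.flush()
            os.fsync(log.fileno())  # push the line to disk before the next sample
            taken += 1
            time.sleep(interval)
```

Running `log_temps("/var/tmp/temps.log")` during a torrent or a compile, then reading the tail of the file after the next shutdown, would show whether any zone was climbing in the final seconds. The per-line `fsync` adds some disk activity of its own, which is the price of keeping the last sample.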
|
|
1 members found this post helpful.
|
11-05-2020, 03:59 AM
|
#25
|
Senior Member
Registered: Jan 2003
Location: Illinois (SW Chicago 'burbs)
Distribution: openSUSE, Raspbian, Slackware. Previous: MacOS, Red Hat, Coherent, Consensys SVR4.2, Tru64, Solaris
Posts: 2,849
|
You mentioned (if memory serves) that the CPU was running at full speed all the time... Is there a reason you need/want that? Heat build-up in a laptop can happen quickly. If you have the means to do it, I'd opt for "adaptive" settings for both CPU and GPU.
I tried ripping CDs on my laptop a few years back and gave up on it. The heat build-up when the drive ran for an extended period was enough to throttle the CPU down to a level where my desktop environment was barely usable. It didn't crash, though.
I tinkered with setting my desktop system's nVidia card in "high performance" mode some time ago and was surprised how quickly the temperature shot up. Around 15F-20F hotter than it runs normally (~125F). CPU temps hover around 100F. If that were to be happening in a laptop where there's far less airflow to help shed that heat... Well, you know.
|
|
|
11-06-2020, 09:37 AM
|
#26
|
LQ Newbie
Registered: Oct 2020
Location: Ukraine
Distribution: Debian
Posts: 13
Original Poster
Rep:
|
> You mentioned (if memory serves) that the CPU was running at full speed all the time... Is there a reason you need/want that?
In the past I tried to disable the GPU completely by making sure there is no firmware and using the "nomodeset" kernel parameter.
I will gladly throttle the CPU to make the system work. How do I do that? **laptop-mode-tools** did not work. I made sure to enable CPU throttling and set it to aggressive. Additionally, I tried running it with the mem=1024M kernel parameter. That did not solve the problem either.
Last edited by vvbond; 11-06-2020 at 09:38 AM.
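On the "How do I do that?" question: independent of laptop-mode-tools, the kernel's cpufreq interface allows capping the clock directly through sysfs. The sketch below assumes the machine has a working cpufreq driver, so that `/sys/devices/system/cpu/cpuN/cpufreq/` exists with `cpuinfo_min_freq` and `scaling_max_freq` (both in kHz), and that it runs as root. The function name is hypothetical, and whether this stops the shutdowns on this laptop is untested.

```python
import glob
import os


def cap_cpu_frequency(base="/sys/devices/system/cpu", khz=None):
    """Write a ceiling to scaling_max_freq for every CPU; return {cpufreq_dir: khz}.

    With khz=None each CPU is capped at its own cpuinfo_min_freq,
    which is the slowest (and coolest) setting the hardware reports.
    """
    written = {}
    # "cpu[0-9]*" matches cpu0, cpu1, ... but not the cpufreq/cpuidle directories.
    for policy in sorted(glob.glob(os.path.join(base, "cpu[0-9]*", "cpufreq"))):
        target = khz
        if target is None:
            with open(os.path.join(policy, "cpuinfo_min_freq")) as f:
                target = int(f.read().strip())
        with open(os.path.join(policy, "scaling_max_freq"), "w") as f:
            f.write(str(target))
        written[policy] = target
    return written
```

Calling `cap_cpu_frequency()` with no arguments pins every core to its minimum clock. The setting does not persist across reboots, so it is easy to undo, and an empty return value means no cpufreq directories were found.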
|
|
|
11-08-2020, 10:12 AM
|
#27
|
Member
Registered: Aug 2018
Location: Brendansport, Sagitta IV
Distribution: PCLinuxOS
Posts: 168
Rep:
|
On torrenting specifically causing a thermal shutdown, this is perhaps related:
I have a 13-year-old quadcore that runs XP64 and does duty as a fileserver, and perhaps because it was a very early quadcore board, the whole thing runs hot (up around 65-80C). It works fine otherwise, but torrents written to a USB external hard drive cause the southbridge to reach a sustained temperature of (are you sitting down?) 110C.
Now, because one of the external drives refuses to stay awake, I hit on the crude solution of letting WinAmp silently run a playlist from it continuously so it gets joggled every 3 or 4 minutes. This does NOT cause the high temperatures that writing a torrent does. (Nor do I see high temps when I write torrents to one of the internally-connected HDs.)
Writing a single very large file to the external drive likewise does not cause high temperatures.
As I don't see this problem on my other hardware, I've concluded that it's probably a flaw in that PC's write caching via USB; when that's repeatedly started and stopped (as with a torrent), it gets hot. So how does that relate to your problem?
Historically, Linux had poor to absent disk caching. It's gotten much better, but I remember when there was effectively none, causing HDs that lacked hardware cache (like older Seagates) to get hot and perform at glacial speeds. So I'm thinking that this really boils down to current Linux driver vs old I/O subsystem, which causes the HD to work overtime and heat up. Windows has better support for old hardware, and historically had far better disk caching, so there you don't see the problem.
So I think I would experiment with different kernels and see if you can determine an era where your HD temperatures change significantly, as this really sounds like Linux driver vs hardware. That one drive being on its way out is, I suspect, unrelated.
I'd also be curious to compare my own preferred distro (PCLinuxOS/KDE), as I run it on a laptop that's about 12 years old and so far have seen no issues, but it doesn't do torrents either. It's rolling and defaults to the latest kernel, for whatever that's worth.
|
|
1 members found this post helpful.
|
11-09-2020, 08:02 AM
|
#28
|
LQ Newbie
Registered: Oct 2020
Location: Ukraine
Distribution: Debian
Posts: 13
Original Poster
Rep:
|
I suspect it's rather hardware damage, since I ran Debian 10 without issue on that machine for a year.
I also tried **thinkfan** to maybe make the fan run continuously. It didn't help. The arcane configuration format doesn't help either. If someone could help me out with that to make **thinkfan** spin my fan for as long as the machine runs, that would be helpful.
I noticed that if I plug an Android phone into the laptop USB slot before or during Debian booting, the machine will reboot without completing the boot. I will maybe look closer into the logs for this. Frankly I am giving up.
Yesterday Windows refused to boot. GRUB loads fine, then the Windows-specific loading begins, and after around 15 seconds the machine reboots. The machine was cold, left in a room with an open window for ten hours in late autumn. After five futile attempts to boot, in desperation I just lifted the laptop in the air with my hands so as to allow better air circulation for the fan. Again, normally it's elevated, has the panel removed, and was cleaned recently. That trick worked, and Windows booted.
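For the goal of keeping the fan spinning the whole time, one alternative to a thinkfan configuration is to write to the generic hwmon PWM files directly. This is not thinkfan and not a thinkfan config; it is a sketch that only works if the laptop's fan controller is exposed as `/sys/class/hwmon/hwmon*/pwmN` with a matching `pwmN_enable`, which many laptops do not provide. In the hwmon sysfs convention `1` in `pwmN_enable` selects manual control and `pwmN` takes a duty value from 0 to 255. The function name is made up, and root is required.

```python
import glob
import os


def force_fans_full(base="/sys/class/hwmon"):
    """Put every pwmN control under base into manual mode at full duty.

    Returns the list of pwm files written; an empty list means the machine
    exposes no hwmon PWM controls and this approach does not apply.
    """
    touched = []
    for enable in sorted(glob.glob(os.path.join(base, "hwmon*", "pwm[0-9]_enable"))):
        pwm = enable[: -len("_enable")]
        with open(enable, "w") as f:
            f.write("1")    # 1 = manual control
        with open(pwm, "w") as f:
            f.write("255")  # 255 = 100% duty cycle
        touched.append(pwm)
    return touched
```

If `force_fans_full()` returns an empty list, the fan is most likely controlled by the embedded controller or ACPI and is out of reach of this method. Firmware may also take control back after a suspend or a thermal event, so the call might need repeating.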
|
|
|
11-09-2020, 11:01 AM
|
#29
|
Member
Registered: Aug 2018
Location: Brendansport, Sagitta IV
Distribution: PCLinuxOS
Posts: 168
Rep:
|
Quote:
Originally Posted by vvbond
I suspect it's rather hardware damage, since I run Debian 10 without issue on that machine for a year.
|
Okay, I missed that. But the point stands that Linux is less tolerant of hardware that is not 100% to its liking, while Windows will usually stagger along regardless.
And given the info in your latest post, most likely it's bad capacitors in the southbridge circuit.
Related: There's a design problem common in older Asus desktop boards (and possibly others) where the southbridge gets too much juice and slowly cooks the associated capacitors. I've seen it 3 times now. The first symptom is that USB either stops working or plugging in a USB device locks up the whole system. It might be that if the current manages to bypass the capacitor, it makes a short and overheats instead, in proportion to how busy the circuit is, and that's what you're seeing.
|
|
1 members found this post helpful.
|
All times are GMT -5. The time now is 10:19 PM.