LinuxQuestions.org
Linux - Hardware This forum is for Hardware issues.
Old 11-03-2020, 08:35 AM   #16
vvbond
LQ Newbie
 
Registered: Oct 2020
Location: Ukraine
Distribution: Debian
Posts: 13

Original Poster
Rep: Reputation: 0

I am running a smartctl test right now on the original drive. The shutdowns still happened after running e2fsck -pcfv, and also with laptop-mode-tools CPU throttling maximized, Ethernet throttling enabled, and aggressive HDD power management enabled.
 
Old 11-03-2020, 10:50 AM   #17
vvbond
LQ Newbie
 
Registered: Oct 2020
Location: Ukraine
Distribution: Debian
Posts: 13

Original Poster
Rep: Reputation: 0
Here is the smartctl output from the original drive. The temperature went up to 49C. With the laptop elevated it dropped to 38C, the same as with the bad drive under the same strict laptop-mode-tools settings. I am surprised it didn't shut down. It used to reboot within minutes of starting gsmartcontrol.

+++ It does show some errors, but they weren't present when I first ran the program. I fear that the errors appeared after I started swapping the HDDs to track down the issue. I might have handled them wrongly.

https://www.linuxquestions.org/quest...1&d=1604418554
Attached Files
File Type: txt WDC_WD2500BEVS-22UST0_WD-WXC108513358_2020-11-03.txt (15.3 KB, 12 views)

Last edited by vvbond; 11-03-2020 at 10:56 AM.
 
Old 11-03-2020, 11:21 AM   #18
RickDeckard
Member
 
Registered: Jan 2014
Location: Canton, Georgia, USA
Distribution: Debian 12
Posts: 205

Rep: Reputation: Disabled
I've never seen a hard drive shut down that easily. The Seagate ST9500325AS specs list an operating temperature range of 0 to 60 degrees Celsius for that model.

That being said, I also have a Seagate 9500320 in an old Asus laptop, and it started piling up read errors at 4.4K power-on hours (POH), with random crashes.

Perhaps Seagate is simply a bad manufacturer if you want lasting equipment.
 
Old 11-03-2020, 11:31 AM   #19
vvbond
LQ Newbie
 
Registered: Oct 2020
Location: Ukraine
Distribution: Debian
Posts: 13

Original Poster
Rep: Reputation: 0
It started suddenly around three months ago. Before that I ran Debian on the original Acer 250GB drive for a little less than two years and Arch Linux on the bad Asus 500GB drive for four years. Before that they were second-hand Windows installations.

And now it can't run for more than 15 minutes. Maybe it can run for an hour if I only browse text and compile a small repo without installing any dev tools. That isn't reasonably usable.

I am not entirely sure it's as simple as buying a new hard drive either, even if I could afford it.
 
Old 11-03-2020, 12:03 PM   #20
computersavvy
Senior Member
 
Registered: Aug 2016
Posts: 3,345

Rep: Reputation: 1486
That WD drive reports a recommended max temp of 60C and a critical max of 85C. It also reports a lifetime max temp of 72C, so I do not suspect the drive is failing the way the Seagate drive clearly was, with its errors. During the self test it reached a stable 49C; I would watch that but not be concerned.

I would watch the WD drive, periodically check status, and observe it cautiously for possible progressive failure.
 
Old 11-03-2020, 12:14 PM   #21
computersavvy
Senior Member
 
Registered: Aug 2016
Posts: 3,345

Rep: Reputation: 1486
Quote:
Originally Posted by vvbond View Post
I am running a smartctl test right now on the original drive. The shutdowns still happened after running e2fsck -pcfv, and also with laptop-mode-tools CPU throttling maximized, Ethernet throttling enabled, and aggressive HDD power management enabled.
I think aggressive power management on HDD is overrated. The benefit of the power savings versus the drive lifetime reduction caused by repeated on/off cycles is questionable. I have never been one to place the cost of the trickle of power used above the cost of replacing the drive should it fail early. Almost all mechanical devices last for more operating hours when run continuously than when subjected to repeated start/stop cycles. If you are truly worried about the amount of power used, then an SSD would be of great benefit. The same applies to Ethernet throttling, though my concern there is the delay in reconnecting after it has been shut down to "save power".

However, the CPU throttling is certainly of benefit if there are temperature or power concerns, since the CPU and GPU are the greatest power hogs on most systems.

Last edited by computersavvy; 11-03-2020 at 12:18 PM.
 
1 members found this post helpful.
Old 11-03-2020, 01:09 PM   #22
EdGr
Senior Member
 
Registered: Dec 2010
Location: California, USA
Distribution: I run my own OS
Posts: 1,055

Rep: Reputation: 492
Quote:
Originally Posted by computersavvy View Post
I think aggressive power management on HDD is overrated. The benefit of the power savings versus the drive lifetime reduction caused by repeated on/off cycles is questionable. I have never been one to place the cost of the trickle of power used above the cost of replacing the drive should it fail early. Almost all mechanical devices last for more operating hours when run continuously than when subjected to repeated start/stop cycles. If you are truly worried about the amount of power used, then an SSD would be of great benefit. The same applies to Ethernet throttling, though my concern there is the delay in reconnecting after it has been shut down to "save power".
I own a pair of the notoriously unreliable Seagate 3TB drives. Early on, I noticed that the power management would park the heads after a period of inactivity. This made an audible noise. I disabled the power management because of the noise, but I figured that doing so also avoided mechanical wear on the parking mechanism. These drives still work after six years.

The OP's drive has over 1M head load/unload cycles. The Seagate desktop drives are rated at 300K cycles.
Ed
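
For anyone who wants to pull these figures out of a report without reading the whole thing, here is a minimal Python sketch. It assumes the usual ten-column attribute table printed by `smartctl -A`; the sample rows and the helper names (`parse_smart_attributes`, `load_cycle_ratio`) are made up for illustration and are not taken from the OP's attachment. The 300K rating is the figure quoted above.

```python
import re

# Hypothetical sample in the layout printed by `smartctl -A`. The numbers are
# illustrative only and are NOT copied from the OP's attached report.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   085   085   000    Old_age   Always       -       11250
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       1050000
194 Temperature_Celsius     0x0022   098   075   000    Old_age   Always       -       49
"""


def parse_smart_attributes(text):
    """Return {attribute_name: leading integer of RAW_VALUE} for each attribute row."""
    attrs = {}
    for line in text.splitlines():
        # Split into at most 10 fields, because RAW_VALUE may contain spaces.
        fields = line.split(None, 9)
        if len(fields) < 10 or not fields[0].isdigit():
            continue  # skip the header and anything that is not an attribute row
        match = re.match(r"\d+", fields[9])
        if match:
            attrs[fields[1]] = int(match.group())
    return attrs


def load_cycle_ratio(attrs, rated_cycles=300_000):
    """Ratio of recorded head load/unload cycles to the rated count."""
    return attrs["Load_Cycle_Count"] / rated_cycles


attrs = parse_smart_attributes(SAMPLE)
print(f"temperature: {attrs['Temperature_Celsius']} C")
print(f"load cycles: {load_cycle_ratio(attrs):.1f}x the rated count")
```

To try it on a real drive, feed it the text of `smartctl -A /dev/sdX` (run as root) instead of `SAMPLE`. Attribute names differ between vendors, so a missing key is quite possible.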

Last edited by EdGr; 11-03-2020 at 02:29 PM. Reason: looked up spec
 
Old 11-03-2020, 03:57 PM   #23
vvbond
LQ Newbie
 
Registered: Oct 2020
Location: Ukraine
Distribution: Debian
Posts: 13

Original Poster
Rep: Reputation: 0
> I think aggressive power management on HDD is overrated. The benefit of the power savings versus the drive lifetime reduction caused by repeated on/off cycles is questionable.

I am aware. I am desperate to make it work at all. I can't really watch it if I can't use it for any work. Any torrenting, moving large files, frequent compilation or video playback, on either of those two drives, just triggers a shutdown.

I think it is getting worse, because the smartctl daemon (smartd) sent four error messages to /var/mail/myuser over the past month. I figured that whatever those are, the latest report covers them.

The only possibility I can think of that I haven't tried yet is to re-apply thermal paste to the CPU, maybe. Can you think of any more possible solutions to random shutdowns?
 
Old 11-03-2020, 10:46 PM   #24
computersavvy
Senior Member
 
Registered: Aug 2016
Posts: 3,345

Rep: Reputation: 1486
Quote:
Originally Posted by vvbond View Post
I am aware. I am desperate to make it work at all. I can't really watch it if I can't use it for any work. Any torrenting, moving large files, frequent compilation or video playback, on either of those two drives, just triggers a shutdown.

I think it is getting worse, because the smartctl daemon (smartd) sent four error messages to /var/mail/myuser over the past month. I figured that whatever those are, the latest report covers them.

The only possibility I can think of that I haven't tried yet is to re-apply thermal paste to the CPU, maybe. Can you think of any more possible solutions to random shutdowns?
Everything you mentioned, in fact everything you have mentioned from the beginning, demands CPU activity. I had a PC that would intermittently shut down unexpectedly like yours, and even though I monitored it closely with gkrellm I was never able to see the CPU temperature spike that triggered it.

My system was water cooled and had been running flawlessly for over a year at about 98% CPU load with temps of only about 50C.
What I finally figured out was that the pump on my cooling system had quit. The temp was spiking so rapidly that gkrellm did not even have time to report the spike before the machine shut down. This is why I asked you about the fans in your machine.

I don't know if new thermal paste will be enough, but it certainly won't hurt and could give you a much more stable machine. Replacing fans that are a decade old would not hurt either.
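
One way to catch a spike that a desktop monitor misses is to log the kernel's thermal zones to disk several times a second and read the last lines after the next shutdown. What follows is only a rough Python sketch under some assumptions: that the machine exposes `/sys/class/thermal/thermal_zone*/temp` (values in millidegrees Celsius), and that a 0.2 second interval is fast enough. The function names and the log file name are invented for the example.

```python
import glob
import os
import time


def read_zone_temps(base="/sys/class/thermal"):
    """Return {zone_name: degrees C} for every readable thermal zone under base."""
    temps = {}
    for path in sorted(glob.glob(os.path.join(base, "thermal_zone*", "temp"))):
        zone = os.path.basename(os.path.dirname(path))
        try:
            with open(path) as f:
                # The kernel reports millidegrees Celsius.
                temps[zone] = int(f.read().strip()) / 1000.0
        except (OSError, ValueError):
            continue  # some zones refuse reads; skip them
    return temps


def log_temps(logfile, interval=0.2, samples=None, base="/sys/class/thermal"):
    """Append one timestamped line per sample; loop forever when samples is None."""
    count = 0
    with open(logfile, "a") as out:
        while samples is None or count < samples:
            temps = read_zone_temps(base)
            readings = " ".join(f"{zone}={temp:.1f}" for zone, temp in temps.items())
            out.write(f"{time.time():.2f} {readings}\n")
            out.flush()
            # Ask the OS to push the line to disk so that the last reading
            # has a good chance of surviving an abrupt power-off.
            os.fsync(out.fileno())
            count += 1
            time.sleep(interval)
```

Calling `log_temps("thermal.log")` in a terminal before starting a torrent or a compile, then checking the tail of `thermal.log` after the reboot, should show whether any zone was climbing just before the shutdown. If no zone moves, that points away from a CPU thermal trip, although not every sensor is exposed this way.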
 
1 members found this post helpful.
Old 11-05-2020, 03:59 AM   #25
rnturn
Senior Member
 
Registered: Jan 2003
Location: Illinois (SW Chicago 'burbs)
Distribution: openSUSE, Raspbian, Slackware. Previous: MacOS, Red Hat, Coherent, Consensys SVR4.2, Tru64, Solaris
Posts: 2,849

Rep: Reputation: 553
You mentioned (if memory serves) that the CPU was running at full speed all the time... Is there a reason you need/want that? Heat build-up in a laptop can happen quickly. If you have the means to do it, I'd opt for "adaptive" settings for both CPU and GPU.

I tried ripping CDs on my laptop a few years back and gave up on it. The heat build-up when the drive ran for an extended period was enough to throttle the CPU down to a level where my desktop environment was barely usable. It didn't crash, though.

I tinkered with setting my desktop system's nVidia card in "high performance" mode some time ago and was surprised how quickly the temperature shot up. Around 15F-20F hotter than it runs normally (~125F). CPU temps hover around 100F. If that were to be happening in a laptop where there's far less airflow to help shed that heat... Well, you know.
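
Since the rest of the thread quotes Celsius, a quick conversion of those figures may help with comparison. The short Python sketch below is plain arithmetic; the only subtlety is that a temperature difference converts with the 5/9 factor alone, without the 32 degree offset. The function names are made up for the example.

```python
def f_to_c(temp_f):
    """Convert an absolute temperature from Fahrenheit to Celsius."""
    return (temp_f - 32) * 5 / 9


def f_delta_to_c(delta_f):
    """Convert a temperature difference; no 32 degree offset applies."""
    return delta_f * 5 / 9


print(round(f_to_c(125), 1))  # 51.7 (normal GPU temperature)
print(round(f_to_c(100), 1))  # 37.8 (CPU temperature)
print(round(f_delta_to_c(15), 1), round(f_delta_to_c(20), 1))  # 8.3 11.1 (the rise)
```

So the desktop GPU normally sits near 52C and ran roughly 8 to 11 degrees C hotter in "high performance" mode.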
 
Old 11-06-2020, 09:37 AM   #26
vvbond
LQ Newbie
 
Registered: Oct 2020
Location: Ukraine
Distribution: Debian
Posts: 13

Original Poster
Rep: Reputation: 0
> You mentioned (if memory serves) that the CPU was running at full speed all the time... Is there a reason you need/want that?

In the past I tried to disable the GPU completely by making sure no firmware was installed and by using the "nomodeset" kernel parameter.

I will gladly throttle the CPU to make the system work. How do I do that? **laptop-mode-tools** did not work. I made sure to enable CPU throttling and set it to aggressive. Additionally, I tried booting with the mem=1024M kernel parameter. That did not solve the problem either.
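
One option that bypasses laptop-mode-tools is to cap the clock directly through the kernel's cpufreq sysfs files. The Python sketch below is a minimal illustration, assuming the kernel has a cpufreq driver for this CPU and therefore exposes `cpuinfo_min_freq` and `scaling_max_freq` under `/sys/devices/system/cpu/cpuN/cpufreq/` (values in kHz). The function name `cap_cpu_frequency` is invented here. It must run as root, the setting is lost on reboot, and there is no guarantee it stops the shutdowns.

```python
import glob
import os


def cap_cpu_frequency(base="/sys/devices/system/cpu"):
    """Set each core's scaling_max_freq to its hardware minimum.

    Returns {cpu_name: frequency in kHz} for every core that was capped.
    """
    applied = {}
    pattern = os.path.join(base, "cpu[0-9]*", "cpufreq")
    for cpufreq_dir in sorted(glob.glob(pattern)):
        # Lowest frequency the hardware supports, as reported by the driver.
        with open(os.path.join(cpufreq_dir, "cpuinfo_min_freq")) as f:
            min_khz = f.read().strip()
        # Writing it to scaling_max_freq stops the governor from going higher.
        with open(os.path.join(cpufreq_dir, "scaling_max_freq"), "w") as f:
            f.write(min_khz)
        cpu_name = os.path.basename(os.path.dirname(cpufreq_dir))
        applied[cpu_name] = int(min_khz)
    return applied
```

Running `cap_cpu_frequency()` as root and then checking `scaling_cur_freq` in the same directories should confirm whether the cap took effect. If the `cpufreq` directories do not exist at all, the kernel has no frequency scaling driver loaded for this CPU, which would also explain why the laptop-mode-tools setting had no effect.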

Last edited by vvbond; 11-06-2020 at 09:38 AM.
 
Old 11-08-2020, 10:12 AM   #27
Reziac
Member
 
Registered: Aug 2018
Location: Brendansport, Sagitta IV
Distribution: PCLinuxOS
Posts: 168

Rep: Reputation: 48
On the point that torrenting specifically causes a thermal shutdown, this may be related:

I have a 13 year old quadcore that runs XP64 and does duty as a fileserver, and perhaps because it was a very early quadcore board, the whole thing runs hot (up around 65-80C). It works fine otherwise, but torrents written to a USB external hard drive cause the southbridge to achieve a sustained temperature of (are you sitting down?) 110C.

Now, because one of the external drives refuses to stay awake, I hit on the crude solution of letting WinAmp silently run a playlist from it continuously so it gets joggled every 3 or 4 minutes. This does NOT cause the high temperatures that writing a torrent does. (Nor do I see high temps when I write torrents to one of the internally-connected HDs.)

Writing a single very large file to the external drive likewise does not cause high temperatures.

As I don't see this problem on my other hardware, I've concluded that it's probably a flaw in that PC's write caching via USB; when that's repeatedly started and stopped (as with a torrent), it gets hot. So how does that relate to your problem?

Historically, Linux had poor to absent disk caching. It's gotten much better, but I remember when there was effectively none, causing HDs that lacked hardware cache (like older Seagates) to get hot and perform at glacial speeds. So I'm thinking that this really boils down to the current Linux driver vs the old I/O subsystem, which causes the HD to work overtime and heat up. Windows has better support for old hardware, and historically had far better disk caching, so there you don't see the problem.

So I think I would experiment with different kernels and see if you can determine an era where your HD temperatures change significantly, as this really sounds like Linux driver vs hardware. That one drive is on its way out is, I suspect, unrelated.

I'd also be curious to compare my own preferred distro (PCLinuxOS/KDE) as I run it on a laptop that's about 12 years old and so far have seen no issues, but it doesn't do torrents either. It's rolling and default is kernel latest, for whatever that's worth.
 
1 members found this post helpful.
Old 11-09-2020, 08:02 AM   #28
vvbond
LQ Newbie
 
Registered: Oct 2020
Location: Ukraine
Distribution: Debian
Posts: 13

Original Poster
Rep: Reputation: 0
I suspect it's rather hardware damage, since I ran Debian 10 without issue on that machine for a year.

I also tried **thinkfan** to maybe make the fan run continuously. It didn't help. The arcane configuration format doesn't help either. If someone could help me out with that to make **thinkfan** spin my fan for as long as the machine runs, that would be helpful.

I noticed that if I plug an Android phone into the laptop's USB port before or during Debian booting, the machine will reboot without completing the boot. I will maybe look closer into the logs for this. Frankly I am giving up.

Yesterday Windows refused to boot. GRUB loads fine, then the Windows-specific loading begins, and after around 15 seconds the machine reboots. The machine was cold, having been left in a room with an open window for ten hours in late autumn. After five futile attempts to boot, in desperation I just lifted the laptop into the air with my hands so as to allow better air circulation for the fan. Again, normally it's elevated, has the panel removed, and was cleaned recently. That trick worked, and Windows booted.
 
Old 11-09-2020, 11:01 AM   #29
Reziac
Member
 
Registered: Aug 2018
Location: Brendansport, Sagitta IV
Distribution: PCLinuxOS
Posts: 168

Rep: Reputation: 48
Quote:
Originally Posted by vvbond View Post
I suspect it's rather hardware damage, since I ran Debian 10 without issue on that machine for a year.
Okay, I missed that. But the point stands that Linux is less tolerant of hardware that is not 100% to its liking, while Windows will usually stagger along regardless.

And given the info in your latest post, most likely it's bad capacitors in the southbridge circuit.

Related: There's a design problem common in older Asus desktop boards (and possibly others) where the southbridge gets too much juice and slowly cooks the associated capacitors. I've seen it 3 times now. First symptom is USB either stops working or plugging in a USB device locks up the whole system. Might be if it manages to bypass the capacitor, it makes a short and overheats instead, proportional to how busy the circuit is, and that's what you're seeing.
 
1 members found this post helpful.
  


All times are GMT -5.
