LinuxQuestions.org
Old 06-20-2022, 11:27 AM   #1
Jason_25
Member
 
Registered: Nov 2001
Posts: 183

Rep: Reputation: 23
Anatomy of an SSD failure


I just experienced the end of a slow breakdown of an SSD.

This SSD has been installed in a media PC for 6 years. The first symptoms came early on, with hangups that I thought were due to memory pressure. The OS would freeze for a while until I killed the browser, so you can see why the problem looked memory related. On the 7th of this month SMART started reporting the drive as bad. I noticed a few days later, on the 15th, when the console reported "new mail". At first I thought it was a false alarm, but then the hangups became progressively worse. Now the drive does not respond at all. No real data loss, as this is a media PC.

I have included the smartctl report from the failed drive and a smartctl report from an identical drive that is still working.

Interestingly, smartctl reports no failed attributes found. But it also says the overall health assessment has failed.

I have also included a picture of the drive internals after opening it up. Nothing on the front or back of the board appears to be burnt. It is surprising how small the circuit board is and how little is inside these Sandisk drives.

One thing of note is that the retired block count, the reserved block count and an unknown SandForce attribute are higher on the failed drive than on the working drive. Power-on hours are much higher too, as is the amount of data read from and written to the drive. Clearly, streaming video on the media PC took a heavy toll on the drive.

Notice how few power cycles are on both drives as I do not like equipment going from cold to hot and back again. I have had too many desktop PCs in the past not turn on due to a bad motherboard or power supply after running for a long time.

I use Samsung and Sandisk SSD drives. Sandisk is still an ok brand to me. On one hand, the performance of the drive and the tiny circuit board inside compared to a Samsung drive are cause for concern. On the other hand, I have reason to suspect that physical stress on the SSD when I was installing it played into the failure. The tablet-like laptop it was installed in was a very complicated fit; I got frustrated during the installation and likely damaged the drive.

I hope this is interesting to someone. I wonder why smartctl simultaneously shows "no failed attributes" but also shows the overall health as failed?
Attached Thumbnails
Name:	IMG_20220620_1153338.jpg
Views:	41
Size:	252.4 KB
ID:	39103  
Attached Files
File Type: txt DRIVE_FAILURE.txt (5.4 KB, 26 views)
File Type: txt DRIVE_PASS.txt (4.7 KB, 17 views)
 
Old 06-20-2022, 12:56 PM   #2
fatmac
LQ Guru
 
Registered: Sep 2011
Location: Upper Hale, Surrey/Hants Border, UK
Distribution: One main distro, & some smaller ones casually.
Posts: 5,867

Rep: Reputation: Disabled
I have a bare disk just like the one in your picture. It is called a DoM (Disk on Module) and came fitted in a thin client that I bought pre-used.

I think they just put one inside a 2.5" case for convenience. Perhaps all SSDs are the same; I've never opened one up.

Last edited by fatmac; 06-20-2022 at 12:59 PM.
 
Old 06-22-2022, 07:13 PM   #3
Jason_25
Member
 
Registered: Nov 2001
Posts: 183

Original Poster
Rep: Reputation: 23
That is pretty interesting. It saves a lot of space like that. I suspect there is a little more inside the Samsung 2.5" SSD drives because they are noticeably heavier and, to me, perform a little bit better. I suppose this discussion is increasingly irrelevant now that most people are moving toward M.2 drives. The replacement drive in the system is performing really well.
 
Old 06-22-2022, 09:19 PM   #4
jefro
Moderator
 
Registered: Mar 2008
Posts: 22,361

Rep: Reputation: 3692
Many systems were not built with an SSD in mind, so there are also a lot of timing problems.

SSDs and USB drives have an oddity that makes them slow down too.
 
Old 06-24-2022, 07:40 AM   #5
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 17,543

Rep: Reputation: 2608
Welcome to my world.

I fixed hardware for a living. Let me assure you that blown or faulty ICs look exactly the same as good ones. For chips blown at the I/O legs (most chips in industrial boards) you could use Analogue Signature Analysis to probe the legs, but with MSI/LSI chips the only viable option was functional testing.

It's all impossible now, because you can't get the parts.
 
Old 06-24-2022, 08:12 AM   #6
michaelk
Moderator
 
Registered: Aug 2002
Posts: 26,757

Rep: Reputation: 6318
The two parameters that might have triggered the fail are:
[code]
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Retired_Block_Count     0x0032   100   100   000    Old_age   Always       -       341
232 Available_Reservd_Space 0x0033   100   100   004    Pre-fail  Always       -       14708775
[/code]

From what I can tell, the normalized value of Available Reserved Space on the SanDisk starts at 100 and decreases.
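
On the "no failed attributes" question: as far as I know, smartctl decides whether an attribute has failed by comparing the normalized VALUE and WORST columns against THRESH, while the overall health verdict is reported by the drive firmware itself. Here is a minimal Python sketch of that comparison, run on the two rows quoted above. The Attribute class and the parse_attributes and status helpers are made-up names for illustration, and treating a threshold of 0 as "never fails" is my assumption, not something confirmed in this thread.

```python
from dataclasses import dataclass

# The two attribute rows quoted in the post above (smartctl -A column order).
REPORT = """\
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
5 Retired_Block_Count 0x0032 100 100 000 Old_age Always - 341
232 Available_Reservd_Space 0x0033 100 100 004 Pre-fail Always - 14708775
"""


@dataclass
class Attribute:
    """One row of the attribute table (hypothetical helper type)."""
    id: int
    name: str
    value: int   # normalized current value
    worst: int   # lowest normalized value recorded so far
    thresh: int  # vendor-defined failure threshold
    kind: str    # "Pre-fail" or "Old_age"
    raw: str     # vendor-specific raw value, kept as text


def parse_attributes(text):
    """Parse whitespace-separated attribute rows, skipping the header line."""
    attrs = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a numeric ID and have at least ten columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attrs.append(Attribute(
            id=int(fields[0]),
            name=fields[1],
            value=int(fields[3]),
            worst=int(fields[4]),
            thresh=int(fields[5]),
            kind=fields[6],
            raw=" ".join(fields[9:]),
        ))
    return attrs


def status(attr):
    """Classify an attribute using only its normalized values."""
    # Assumption: a threshold of 0 means the attribute can never fail.
    if attr.thresh == 0:
        return "ok"
    if attr.value <= attr.thresh:
        return "failing now"
    if attr.worst <= attr.thresh:
        return "failed in the past"
    return "ok"


if __name__ == "__main__":
    for a in parse_attributes(REPORT):
        print(f"{a.id:>3} {a.name:<24} value={a.value} thresh={a.thresh} "
              f"raw={a.raw} -> {status(a)}")
```

Both rows come out as ok, because a normalized value of 100 is well above the thresholds of 000 and 004. That would be consistent with smartctl listing no failed attributes even though the drive reports itself as failing. The raw values (such as the 341 retired blocks) do not take part in this check.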
 
Old 06-24-2022, 10:52 AM   #7
anon286
Member
 
Registered: Jun 2016
Location: UK
Posts: 132

Rep: Reputation: Disabled
Not relevant, but I have a Dimension 4600 running a twenty-year-old drive. That computer was on every day and it still works, nearly twenty years on. I don't get it. Nothing has been touched except for the RAM, changed in 2008, the graphics, upgraded in 2010, and a battery change a couple of times in the past eleven years. Even the mouse, mouse mat, and screen are gone. All that is left with it is the keyboard. LOL

I only have it for Microsoft flight simulator 2004, Elite force, Men in Black the game.

Some hardware is built to last longer, and other hardware packs up after a few years.
 
Old 06-25-2022, 03:56 AM   #8
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 17,543

Rep: Reputation: 2608
I had an instructor in 1977 who told us of a test he started in 1961. They put NPN & PNP transistors on switching various loads regularly, 24/7/365. 16 years later, it was substantially the same transistors doing the job, although a few had been replaced. At the time, this was a sharp contrast with thermionic valves, which would all have weakened and nearly all have died. The weak link in your 20-year-old box will be the hard disk, with its moving, mechanical & magnetic parts.
 
Old 06-25-2022, 12:23 PM   #9
anon286
Member
 
Registered: Jun 2016
Location: UK
Posts: 132

Rep: Reputation: Disabled
I thought so. However, that doesn't stop capacitors from going knackered. The OptiPlex GX270 small form factor, a 2001 Dell system with a floppy drive, has a few bad capacitors, but it runs. I shouldn't have installed, or attempted to install, Linux on it; as a result it became useless. I can't even run Vice City on the integrated graphics. I even tried the Dell resource CD, but it isn't any good. But yes, erm, I guess the drive in that one is more useful for the 4600.

I can't play Mafia on the 4600. I probably should have done reinstalls of XP, since certain software is difficult to get on the web. In the end it is pretty much old news.

But they all have their uses.
 
  

