LinuxQuestions.org
Old 11-29-2020, 09:38 PM   #1
k.yepes
LQ Newbie
 
Registered: Nov 2020
Distribution: mint
Posts: 25

Rep: Reputation: Disabled
New SSD going read-only at random times about once a month.


I have a PC that I built in 2010 that has had 3 different old-style mechanical hard drives for the last 10 years. A few months ago I thought it would be wise to replace the main drive (the one my OS is installed on; I use Linux Mint, I think it's version 20?). I bought one of the Samsung EVO 500GB SSDs.

I did this mostly because I didn't want to deal with the OS drive crashing; this computer is my "daily driver". I cloned the old mechanical HD onto the SSD with Clonezilla. When I was done I noticed a huge increase in performance.

A few days went by, and I noticed that a bunch of things stopped working and behaved strangely. I found that this was because the drive suddenly became 'read only' when I was using it. (I can't remember what I was doing when I discovered this). I figured I'd reboot- and see if the problem went away. When I rebooted I could not get past GRUB. It would only give me an INITRAMFS prompt.

I found this link and followed its instructions, which fixed it:
https://askubuntu.com/questions/1243...n-ubuntu-20-04

This fix worked for about a month or so. Then one day I fell asleep watching YouTube, and when I woke up after about an hour I noticed that the video was stuck "loading". The same thing had happened: the drive had gone read-only. Since I had dealt with this before, I threw in a Linux live ISO DVD and fixed it just like before.

About another month went by, and today this happened again. Today I was copying some files (from a DIFFERENT hard drive inside of the same PC to my laptop via SSH) and I went into my garage to do some work because it was going to take about an hour or 2. I came back and noticed that the file system had become read only again.

I am at work now, so I figured I'd use my spare time to research this and ask questions because I do not know what is causing this. I am worried that I may have received a bad SSD. (I bought it from NEWEGG). I don't want to keep operating this way.
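To confirm the symptom the next time it happens, one option is to look for the `ro` flag in the mount table. What follows is a minimal sketch, assuming the usual /proc/mounts layout (device, mount point, type, options); the helper name `ro_mounts` is made up for this illustration.

```shell
#!/bin/sh
# ro_mounts (hypothetical helper): read mount-table lines in the
# /proc/mounts format on stdin and print the mount point of every
# entry whose option list contains the exact option "ro".
ro_mounts() {
    awk '{
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print $2; break }
    }'
}

# Typical use on a live system, limited to disk-backed filesystems:
#   grep '^/dev/' /proc/mounts | ro_mounts
```

If `/` shows up in that list, running `sudo dmesg | grep -i 'EXT4-fs error'` before rebooting may show what triggered the remount, since ext4 normally logs the error that made it switch to read-only.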
 
Old 11-30-2020, 09:08 AM   #2
fatmac
LQ Guru
 
Registered: Sep 2011
Location: Upper Hale, Surrey/Hants Border, UK
Distribution: One main distro, & some smaller ones casually.
Posts: 5,744

Rep: Reputation: Disabled
Something is obviously corrupting your file system; find out what, and the problem is solved.
(Perhaps there is a bug in Ubuntu?)
 
Old 11-30-2020, 10:11 AM   #3
kilgoretrout
Senior Member
 
Registered: Oct 2003
Posts: 3,010

Rep: Reputation: 396
Please post the output of:
Code:
$ systemctl status fstrim.timer
Since you cloned from a standard hard drive, I'm curious whether the SSD trim timer, fstrim.timer, is enabled so that fstrim runs periodically. An SSD needs this or performance will deteriorate, and it may not be running since you originally installed to a standard hard drive. If this is the cause, the next time the problem arises run:
Code:
$ sudo fstrim -a -v
That will manually run fstrim on the drive and if that corrects the problem, you have found the culprit. Also, post the output of:
Code:
$ lsblk -f
and:
Code:
$ df -h
That will give some idea of your partition layout, filesystems used and how full your drive is.
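Alongside the timer status, it can help to confirm that the drive advertises discard (TRIM) support at all. A minimal sketch, assuming the `NAME` and `DISC-MAX` columns of `lsblk`; the helper name `trim_capable` is hypothetical.

```shell
#!/bin/sh
# trim_capable (hypothetical helper): read "NAME DISC-MAX" pairs on stdin,
# as printed by `lsblk -dn -b -o NAME,DISC-MAX`, and print the devices that
# accept discards. A DISC-MAX of 0 means no TRIM/discard support is advertised.
trim_capable() {
    awk '$2 + 0 > 0 { print $1 }'
}

# On a live system:
#   lsblk -dn -b -o NAME,DISC-MAX | trim_capable
#   systemctl is-enabled fstrim.timer
```

If the SSD is listed and the timer reports `enabled`, trimming is probably not the missing piece.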
 
Old 11-30-2020, 10:35 AM   #4
uteck
Senior Member
 
Registered: Oct 2003
Location: Elgin,IL,USA
Distribution: KDE Neon
Posts: 1,244

Rep: Reputation: 520
Since your mobo is 10 years old, I would check whether any of its capacitors are failing; cheaply made ones degrade as the chemicals inside react.
Look for any that have a rust-like substance on or around them, and for any that are rounded on top. The tops should be flat; as the chemicals react, they cause the top to bulge.
As the capacitors fail, they stop regulating the voltage of various subsystems on the board, which causes errors of various sorts.
 
Old 11-30-2020, 08:03 PM   #5
k.yepes
LQ Newbie
 
Registered: Nov 2020
Distribution: mint
Posts: 25

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by uteck
Since your mobo is 10 years old, I would check whether any of its capacitors are failing; cheaply made ones degrade as the chemicals inside react.
Look for any that have a rust-like substance on or around them, and for any that are rounded on top. The tops should be flat; as the chemicals react, they cause the top to bulge.
As the capacitors fail, they stop regulating the voltage of various subsystems on the board, which causes errors of various sorts.
I open the computer probably twice a year and clean out everything with compressed air. I also build my own circuitboards by hand so I think I would probably notice this. I will pay closer attention to it next time though. Strange thing is that this only started when I changed to SSD from old-school mechanical HD.

Quote:
Originally Posted by kilgoretrout View Post
Please post the output of:
Code:
$ systemctl status fstrim.timer
Since you cloned from a standard hard drive, I'm curious whether the SSD trim timer, fstrim.timer, is enabled so that fstrim runs periodically. An SSD needs this or performance will deteriorate, and it may not be running since you originally installed to a standard hard drive. If this is the cause, the next time the problem arises run:
Code:
$ sudo fstrim -a -v
That will manually run fstrim on the drive and if that corrects the problem, you have found the culprit. Also, post the output of:
Code:
$ lsblk -f
and:
Code:
$ df -h
That will give some idea of your partition layout, filesystems used and how full your drive is.

Thank you very much for this suggestion. I will do this next time I experience the issue.



Just for grins & giggles, I took these photos real quick with my cell phone when I booted from a live ISO DVD and did my fsck thing. I kept hitting "Y" whenever it prompted me.

Now that I think of it, I could have copy-pasted the screen and saved it on one of the other drives (sdb or sdc).

https://i.imgur.com/jUHx8Uw.jpg
https://i.imgur.com/ZH7x2XZ.jpg
 
Old 12-01-2020, 10:08 PM   #6
kilgoretrout
Senior Member
 
Registered: Oct 2003
Posts: 3,010

Rep: Reputation: 396
You might want to double-check your cabling; maybe swap out your data and power cables if you have some available. I take it you didn't have any hard drive problems before you made the switch to the Samsung SSD. You may have had some problem with the clone, or moving the cables around may have caused some issue. You may also want to reseat your RAM and check it with memtest. Maybe something got bumped.

The first command I gave you will tell you if your fstrim.timer is enabled. If it is, then you probably don't have an fstrim problem as the timer will automatically run fstrim at set intervals.

Your screenshots definitely show a corrupted filesystem that fsck is repairing. You may also want to monitor the condition of your SSD with the SMART tools available for Linux in the smartmontools package. See:

https://www.linux.com/topic/desktop/...-health-smart/

and

https://linuxhandbook.com/check-ssd-health/

If the drive is failing or the SMART tests show signs of problems, I'd return it to Newegg.
 
Old 12-02-2020, 02:56 PM   #7
k.yepes
LQ Newbie
 
Registered: Nov 2020
Distribution: mint
Posts: 25

Original Poster
Rep: Reputation: Disabled
Code:
user1@machine:~$ systemctl status fstrim.timer
● fstrim.timer - Discard unused blocks once a week
     Loaded: loaded (/lib/systemd/system/fstrim.timer; enabled; vendor preset: enabled)
     Active: active (waiting) since Wed 2020-12-02 12:45:43 CST; 1h 9min ago
    Trigger: Mon 2020-12-07 00:00:00 CST; 4 days left
   Triggers: ● fstrim.service
       Docs: man:fstrim

Dec 02 12:45:43 minarae systemd[1]: Started Discard unused blocks once a week.
user1@machine:~$
 
Old 12-02-2020, 03:01 PM   #8
k.yepes
LQ Newbie
 
Registered: Nov 2020
Distribution: mint
Posts: 25

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by kilgoretrout View Post
.... You may have had some problem with the clone, or moving the cables around may have caused some issue. You may also want to reseat your RAM and check it with memtest. Maybe something got bumped......

The only frustrating thing during cloning was that the actual sizes of my old mechanical hard drive and my new SSD were different. Both were labeled "500GB", but in reality the SSD was 1 megabyte smaller than the mechanical one. I ended up shrinking one of the partitions on the mechanical drive by a few MB with GParted just to get it to work.
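The size check described here can be done before cloning. A rough sketch follows, comparing whole-device byte counts only; the helper name `shrink_needed_mib` and the `/dev/sdX` and `/dev/sdY` placeholders are made up for illustration. In practice what matters is that the last partition ends before the end of the target disk.

```shell
#!/bin/sh
# shrink_needed_mib SRC_BYTES DST_BYTES (hypothetical helper)
# Print how many MiB, rounded up, the source exceeds the target by;
# print 0 if the source already fits.
shrink_needed_mib() {
    src=$1
    dst=$2
    if [ "$src" -le "$dst" ]; then
        echo 0
    else
        echo $(( (src - dst + 1048575) / 1048576 ))
    fi
}

# On a live system the byte counts can come from blockdev, e.g.:
#   shrink_needed_mib "$(sudo blockdev --getsize64 /dev/sdX)" \
#                     "$(sudo blockdev --getsize64 /dev/sdY)"
```

With a source exactly 1 MiB larger than the 500,107,862,016-byte SSD, the helper reports 1.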
 
Old 12-02-2020, 03:12 PM   #9
JockVSJock
Senior Member
 
Registered: Jan 2004
Posts: 1,420
Blog Entries: 4

Rep: Reputation: 164
Just curious, does your SSD in Mint Linux live under /dev/sda1?

If yes, can you show us the output of the following

Code:
smartctl -a /dev/sda1
 
Old 12-02-2020, 03:18 PM   #10
k.yepes
LQ Newbie
 
Registered: Nov 2020
Distribution: mint
Posts: 25

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by JockVSJock View Post
Just curious, does your SSD in Mint Linux live under /dev/sda1?

If yes, can you show us the output of the following

Code:
smartctl -a /dev/sda1
It's sda6...

Code:
user1@machine:~$ sudo smartctl -a /dev/sda6
[sudo] password for user1:            
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-56-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Samsung based SSDs
Device Model:     Samsung SSD 860 EVO 500GB
Serial Number:    S598NZFN800834J
LU WWN Device Id: 5 002538 ec08d51c0
Firmware Version: RVT04B6Q
User Capacity:    500,107,862,016 bytes [500 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-4 T13/BSR INCITS 529 revision 5
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Wed Dec  2 14:16:59 2020 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(    0) seconds.
Offline data collection
capabilities: 			 (0x53) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					No Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 (  85) minutes.
SCT capabilities: 	       (0x003d)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       879
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       95
177 Wear_Leveling_Count     0x0013   099   099   000    Pre-fail  Always       -       2
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   072   059   000    Old_age   Always       -       28
195 ECC_Error_Rate          0x001a   200   200   000    Old_age   Always       -       0
199 CRC_Error_Count         0x003e   098   098   000    Old_age   Always       -       1079
235 POR_Recovery_Count      0x0012   099   099   000    Old_age   Always       -       5
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       1814778529

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
  256        0    65535  Read_scanning was never started
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Last edited by k.yepes; 12-02-2020 at 03:43 PM.
 
Old 12-02-2020, 03:29 PM   #11
JockVSJock
Senior Member
 
Registered: Jan 2004
Posts: 1,420
Blog Entries: 4

Rep: Reputation: 164
Be easier to read if you placed the output in code tags.

Code:
much easier to read
According to the output, seems like the SSD is healthy.
 
Old 12-03-2020, 01:52 AM   #12
ondoho
LQ Addict
 
Registered: Dec 2013
Posts: 19,872
Blog Entries: 12

Rep: Reputation: 6053
Quote:
Originally Posted by k.yepes View Post
It's sda6...
FWIW...
This is just one partition on the physical device sda.
You should boot into a live environment (or otherwise make sure no partitions on /dev/sda are mounted) and run
Code:
# smartctl -a /dev/sda
IIRC this can take a very long time.
https://dt.iki.fi/hard-drive-and-fs-health-checks
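To get from a partition name such as sda6 to the whole device, `lsblk -no PKNAME /dev/sda6` asks the kernel directly. The sketch below does the same by name only, assuming the common sdX, hdX, vdX, nvme and mmcblk naming schemes; the helper name `parent_disk` is hypothetical.

```shell
#!/bin/sh
# parent_disk DEVICE (hypothetical helper)
# Strip the partition suffix from a device name:
#   /dev/sda6 -> /dev/sda, /dev/nvme0n1p2 -> /dev/nvme0n1
# Names that already refer to a whole disk are returned unchanged.
parent_disk() {
    _dev=${1#/dev/}
    case "$_dev" in
        nvme*n*p[0-9]*|mmcblk*p[0-9]*)
            _dev=$(printf '%s\n' "$_dev" | sed -E 's/p[0-9]+$//') ;;
        sd*|hd*|vd*)
            _dev=$(printf '%s\n' "$_dev" | sed -E 's/[0-9]+$//') ;;
    esac
    printf '/dev/%s\n' "$_dev"
}

# Example: sudo smartctl -a "$(parent_disk /dev/sda6)"
```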
 
Old 12-06-2020, 05:23 AM   #13
k.yepes
LQ Newbie
 
Registered: Nov 2020
Distribution: mint
Posts: 25

Original Poster
Rep: Reputation: Disabled
Code:
 sudo smartctl -a /dev/sda6
[sudo] password for user1:            
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-56-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Samsung based SSDs
Device Model:     Samsung SSD 860 EVO 500GB
Serial Number:    S598NZFN800834J
LU WWN Device Id: 5 002538 ec08d51c0
Firmware Version: RVT04B6Q
User Capacity:    500,107,862,016 bytes [500 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-4 T13/BSR INCITS 529 revision 5
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is:    Sun Dec  6 04:22:46 2020 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(    0) seconds.
Offline data collection
capabilities: 			 (0x53) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					No Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 (  85) minutes.
SCT capabilities: 	       (0x003d)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       910
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       97
177 Wear_Leveling_Count     0x0013   099   099   000    Pre-fail  Always       -       2
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   072   059   000    Old_age   Always       -       28
195 ECC_Error_Rate          0x001a   200   200   000    Old_age   Always       -       0
199 CRC_Error_Count         0x003e   098   098   000    Old_age   Always       -       1131
235 POR_Recovery_Count      0x0012   099   099   000    Old_age   Always       -       5
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       1855084265

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
  256        0    65535  Read_scanning was never started
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
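Comparing this output with the one posted on 12-02, the one counter that stands out is attribute 199, CRC_Error_Count, which went from 1079 to 1131 while Power_On_Hours went from 879 to 910. That attribute counts transfer errors on the SATA link, so a rising value is usually read as pointing at the cable, connector or port rather than the flash itself, which would fit the cabling suggestion above; the link is also reported as running at 1.5 Gb/s. A minimal sketch for pulling the raw value out of `smartctl -A` output, with the hypothetical helper name `smart_raw`:

```shell
#!/bin/sh
# smart_raw ID (hypothetical helper): read `smartctl -A` attribute-table
# lines on stdin and print the first token of RAW_VALUE (10th column)
# for the attribute with the given ID.
smart_raw() {
    awk -v id="$1" '$1 == id { print $10; exit }'
}

# The two readings below are copied from the outputs posted in this thread.
first='199 CRC_Error_Count         0x003e   098   098   000    Old_age   Always       -       1079'
second='199 CRC_Error_Count         0x003e   098   098   000    Old_age   Always       -       1131'

before=$(printf '%s\n' "$first" | smart_raw 199)
after=$(printf '%s\n' "$second" | smart_raw 199)
echo "CRC_Error_Count grew by $((after - before))"
```

On a live system the same helper can be fed directly: `sudo smartctl -A /dev/sda | smart_raw 199`. If the number stops climbing after swapping the SATA cable or port, that would suggest the link was the problem.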
 
Old 12-06-2020, 03:47 PM   #14
k.yepes
LQ Newbie
 
Registered: Nov 2020
Distribution: mint
Posts: 25

Original Poster
Rep: Reputation: Disabled
I called Newegg today and honestly I feel like the gentleman I spoke to had no clue. They basically told me, "Well, since you didn't buy our warranty you need to call Samsung, here: 1800-726-7864". I called them (not being sure if they were open because it is a Sunday) and the automated robot transferred me to a SEAGATE recording (I did not know SAMSUNG drives were SEAGATE??!?) that told me they were closed. I will call them back tomorrow and see what they say.
 
Old 12-06-2020, 03:51 PM   #15
k.yepes
LQ Newbie
 
Registered: Nov 2020
Distribution: mint
Posts: 25

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by kilgoretrout View Post
Please post the output of:
Code:
$ systemctl status fstrim.timer
Since you cloned from a standard hard drive, I'm curious whether the SSD trim timer, fstrim.timer, is enabled so that fstrim runs periodically. An SSD needs this or performance will deteriorate, and it may not be running since you originally installed to a standard hard drive. If this is the cause, the next time the problem arises run:
Code:
$ sudo fstrim -a -v
That will manually run fstrim on the drive and if that corrects the problem, you have found the culprit. Also, post the output of:
Code:
$ lsblk -f
and:
Code:
$ df -h
That will give some idea of your partition layout, filesystems used and how full your drive is.
Code:
user1@machine:~$ lsblk -f
NAME   FSTYPE LABEL           UUID                                 FSAVAIL FSUSE% MOUNTPOINT
sda                                                                               
├─sda1 ntfs   win_7_c         82DC7A6FDC7A5CFB                                    
├─sda2                                                                            
├─sda5 swap                   acc1a65f-e3b3-4847-9c4e-35e47fdcf745                [SWAP]
└─sda6 ext4                   3ea32215-948a-4b46-b77b-f6a2597fd5e6   83.1G    50% /
sdb                                                                               
└─sdb1 ntfs   sata_500gb_dump 14E2B2B0E2B29608                                    
sdc                                                                               
└─sdc1 ntfs   ide_500gb_dump  E6ACF904ACF8CFD3                                    
sr0                                                                               
user1@machine:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            3.9G     0  3.9G   0% /dev
tmpfs           797M  1.4M  795M   1% /run
/dev/sda6       184G   91G   84G  53% /
tmpfs           3.9G  187M  3.8G   5% /dev/shm
tmpfs           5.0M  4.0K  5.0M   1% /run/lock
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
tmpfs           797M   24K  797M   1% /run/user/1000
user1@machine:~$
 
  

