LinuxQuestions.org
Slackware This Forum is for the discussion of Slackware Linux.

Old 11-05-2016, 06:20 AM   #16
Olek
Member
 
Registered: Jul 2012
Location: Wroclaw Poland
Distribution: Slackware
Posts: 110

Rep: Reputation: 27

Run
Code:
# smartctl -t long /dev/sda
After this command, smartctl will tell you when the test is expected to finish.
For example, on my 3TB disk the test takes about 5 hours.

Once the test has finished, run
Code:
# smartctl -a /dev/sda
and you will see the real number of pending sectors.
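
To pull out just that number instead of reading the whole report, a small filter may help. This is only a sketch: pending_sectors is a made-up helper name, and it assumes the usual smartctl attribute table layout, where the attribute is called Current_Pending_Sector and the raw value sits in the tenth column.

```shell
# Print the raw value of SMART attribute 197 (Current_Pending_Sector)
# from a smartctl attribute table read on stdin.
# Assumption: the standard 10-column table that smartctl prints.
pending_sectors() {
    awk '$1 == "197" && $2 == "Current_Pending_Sector" { print $10 }'
}
```

Typical use would be `smartctl -A /dev/sda | pending_sectors`, run as root.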
 
Old 11-05-2016, 09:05 AM   #17
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,779

Rep: Reputation: 2212
That increase in the pending sector count doesn't necessarily mean that anything changed. A bad sector won't be discovered and marked "pending" until something tries to read it.

I have to wonder, though, whether something might have turned off the drive's automatic defect management. That would explain the write error on the bad sector. I thought that modern drives no longer had the ability to turn that off, but perhaps yours is one of the exceptions. See the paragraph for the "-D" option in the hdparm manpage.
 
Old 11-05-2016, 09:06 AM   #18
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,779

Rep: Reputation: 2212
Quote:
Originally Posted by Olek View Post
Run
Code:
# smartctl -t long /dev/sda
After this command, smartctl will tell you when the test is expected to finish.
For example, on my 3TB disk the test takes about 5 hours.

Once the test has finished, run
Code:
# smartctl -a /dev/sda
and you will see the real number of pending sectors.
Unfortunately, that test stops on the first error it encounters, so it won't uncover further bad sectors.
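
One possible way around that, assuming the drive supports selective self-tests (ATA only), is to start a new test just past the failing sector shown in the LBA_of_first_error column of the self-test log. A rough sketch follows. resume_span is a made-up helper, not part of smartmontools, and each selective run would again stop at the next read failure, so it may need repeating.

```shell
# Given the LBA of the first error from the self-test log, print the
# smartctl -t argument for a selective test that starts one sector
# later and runs to the end of the disk.
resume_span() {
    echo "select,$(( $1 + 1 ))-max"
}
```

It could be used as `smartctl -t "$(resume_span LBA)" /dev/sda`, with LBA replaced by the value from the log.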
 
Old 11-05-2016, 12:59 PM   #19
atelszewski
Member
 
Registered: Aug 2007
Distribution: Slackware
Posts: 948

Original Poster
Rep: Reputation: Disabled
Hi,

For all of you SMART people (no pun intended :-)), after smartctl -t long (yes, I waited for the requested time before using -a switch):
Code:
$ smartctl -a /dev/sda
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.4.29] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     HGST HTE721010A9E630
Serial Number:    JR10034M2Y2MXK
LU WWN Device Id: 5 000cca 8a8e967b0
Firmware Version: JB0OA3M0
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      2.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sat Nov  5 18:49:12 2016 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 121)	The previous self-test completed having
					the read element of the test failed.
Total time to complete Offline 
data collection: 		(   45) seconds.
Offline data collection
capabilities: 			 (0x5b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 170) minutes.
SCT capabilities: 	       (0x003d)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   062    Pre-fail  Always       -       65536
  2 Throughput_Performance  0x0005   100   100   040    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0007   127   127   033    Pre-fail  Always       -       2
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       23
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   040    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0012   094   094   000    Old_age   Always       -       2885
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       2
191 G-Sense_Error_Rate      0x000a   076   076   000    Old_age   Always       -       198415
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       0
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       3360
194 Temperature_Celsius     0x0002   181   181   000    Old_age   Always       -       33 (Min/Max 20/34)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       10
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       24
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0
223 Load_Retry_Count        0x000a   100   100   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%      2879         9548728
# 2  Short offline       Completed without error       00%      2783         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
--
Best regards,
Andrzej Telszewski
 
Old 11-05-2016, 01:21 PM   #20
Emerson
LQ Sage
 
Registered: Nov 2004
Location: Saint Amant, Acadiana
Distribution: Gentoo ~amd64
Posts: 7,661

Rep: Reputation: Disabled
Code:
1  Extended offline    Completed: read failure       90%      2879         9548728
Warranty. It failed within the first 10% of the test.
 
Old 11-05-2016, 01:27 PM   #21
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,779

Rep: Reputation: 2212
Quote:
Originally Posted by atelszewski View Post
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 90% 2879 9548728
As expected, the test found an error and stopped. This was less than 1% of the way through the disk's 1953525168 logical sectors. Pointless.
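
As a rough check of that figure, the position can be worked out from the smartctl report, assuming LBA_of_first_error counts 512-byte logical sectors. percent_scanned is only an illustrative helper name.

```shell
# How far into the disk did the test get before the first error?
# $1 = failing LBA, $2 = disk capacity in bytes, $3 = logical sector size
percent_scanned() {
    awk -v lba="$1" -v bytes="$2" -v ss="$3" \
        'BEGIN { printf "%.2f\n", 100 * lba / (bytes / ss) }'
}
```

With the values from the report, `percent_scanned 9548728 1000204886016 512` prints 0.49, so the test stopped about half a percent into the disk.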

If you really want to find out how many bad sectors there are, run
Code:
dd if=/dev/sda of=/dev/null bs=4k conv=noerror
and then look at the number of pending sectors. I do not recommend doing this before recovering whatever data you can. Beating on a dying disk just to see how bad it is is not productive, and can make the problems worse. Using ddrescue to make an image with the readable sectors would be a better alternative.
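
A ddrescue run along those lines might look like the sketch below. It follows the common two-pass pattern from the GNU ddrescue manual. The helper name rescue_plan and all paths are placeholders, the image and map file must live on a different, healthy disk, and the function only prints the commands so they can be checked before anything touches the failing drive.

```shell
# Print (do not run) a two-pass GNU ddrescue plan.
# $1 = source device, $2 = image file, $3 = map file (all placeholders)
rescue_plan() {
    # Pass 1: grab everything that reads easily, skipping the slow
    # scraping phase (-n).
    echo "ddrescue -n $1 $2 $3"
    # Pass 2: return to the bad areas with direct disc access (-d)
    # and up to 3 retry passes (-r3).
    echo "ddrescue -d -r3 $1 $2 $3"
}
```

For example, `rescue_plan /dev/sda /mnt/backup/sda.img /mnt/backup/sda.map` prints the two commands; once they look right, they can be run as root.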
 
Old 11-05-2016, 02:25 PM   #22
atelszewski
Member
 
Registered: Aug 2007
Distribution: Slackware
Posts: 948

Original Poster
Rep: Reputation: Disabled
Hi,

Just a side question.
Would it be wise to go with two SSDs in a RAID-1 configuration?
That's probably something I could afford.

Please note that it's my favorite toy machine.
I want it to be the best possible, within a sensible budget.
Loss of data wouldn't cause major injuries, and there are backups too.
It just feels better with the uptime ticking up continuously :-)

--
Best regards,
Andrzej Telszewski
 
Old 11-05-2016, 02:30 PM   #23
Emerson
LQ Sage
 
Registered: Nov 2004
Location: Saint Amant, Acadiana
Distribution: Gentoo ~amd64
Posts: 7,661

Rep: Reputation: Disabled
RAID-1 is for read speed. No redundancy really. Isn't SSD already fast enough for you?
 
Old 11-05-2016, 02:36 PM   #24
atelszewski
Member
 
Registered: Aug 2007
Distribution: Slackware
Posts: 948

Original Poster
Rep: Reputation: Disabled
Hi,

Quote:
Originally Posted by Emerson View Post
RAID-1 is for read speed. No redundancy really. Isn't SSD already fast enough for you?
Have I misunderstood Wiki?
Aren't there two copies?

--
Best regards,
Andrzej Telszewski
 
Old 11-05-2016, 03:29 PM   #25
Emerson
LQ Sage
 
Registered: Nov 2004
Location: Saint Amant, Acadiana
Distribution: Gentoo ~amd64
Posts: 7,661

Rep: Reputation: Disabled
Two copies, yes. If one gets corrupted, the other one gets corrupted, too. Only if one drive dies suddenly will the other one have the data intact.
 
Old 11-05-2016, 03:33 PM   #26
atelszewski
Member
 
Registered: Aug 2007
Distribution: Slackware
Posts: 948

Original Poster
Rep: Reputation: Disabled
Hi,

Quote:
Originally Posted by Emerson View Post
Two copies, yes. If one gets corrupted, the other one gets corrupted, too. Only if one drive dies suddenly will the other one have the data intact.
OK, that's what I was afraid of when I read about RAID-1.
So I would need something with error correction.
I'm gonna have a look at the possibilities, but most probably I'm gonna give up on the idea.

Thanks.

--
Best regards,
Andrzej Telszewski
 
Old 11-05-2016, 04:09 PM   #27
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,779

Rep: Reputation: 2212
RAID-1 will protect against data loss due to a drive failure. That is one cause of data loss. There is no form of RAID that protects against the other causes of data loss, such as accidental deletion, overwriting, OS failures that corrupt the filesystem, etc. RAID is not a substitute for backups. And of course RAID adds its own complexity and modes of failure to the mix. Its primary function is to allow a system to keep running seamlessly while a failed drive is replaced. If that is important vs. the hours of down time while a failed drive is replaced and restored from backup, then you need RAID. Otherwise, not so much, aside from the bragging rights about your continuous uptime (assuming that your drives are hot-swappable -- which they probably are not).
 
Old 11-07-2016, 11:39 AM   #28
atelszewski
Member
 
Registered: Aug 2007
Distribution: Slackware
Posts: 948

Original Poster
Rep: Reputation: Disabled
Hi,

There was no way to upgrade the hardware of this server.
I switched to another one of the same class, with a 250GB SSD.
Two fewer moving parts to wear out ;-)

--
Best regards,
Andrzej Telszewski
 
  

