LinuxQuestions.org
Linux - Hardware This forum is for Hardware issues.
04-09-2019, 02:50 AM   #1
kaza
Member
 
Registered: Apr 2010
Distribution: FC17
Posts: 383

Rep: Reputation: 3
How can disk reporting badblocks>0 suddenly become badblocks==0?


Hello!

A few months ago I started getting I/O errors during a full backup with "tar".
The errors were concentrated on specific files. I ran "badblocks"
(default read-only mode, with progress indication) on that disk, which is a pair
of SAS disks ("TOSHIBA MK2001TRKB", bought in 2012) arranged as a "hardware RAID 0"
array, controlled by an Adaptec ICP5165BR controller and accessed as a logical volume,
and got quite a lot of bad blocks.
That was a sign that the disk (at least one of the two) needed to be replaced,
so I bought a pair of new SAS disks of the same capacity, connected the new pair
to the same connector as the old pair, and moved the old pair to another connector
of the RAID controller. After I set up the new array identically to the old one,
the system returned to normal operation.
At that point I got curious: which of the two old disks (or both) had developed the bad blocks?
I deleted the old array, configured each of the two old disks as a separate volume, created
an ext4 partition on each and formatted it. Then I ran "badblocks" (again
in default read-only mode) on one of the disks and got zero bad blocks. So I thought "ok,
all the bad blocks are on the second disk", ran "badblocks" on the second disk, and was surprised
to get the same result: zero bad blocks on it too. That looked strange. I remember perfectly
that when I had the I/O errors during backup I got a nonzero (and quite high) number
of bad blocks. So, just to check whether connecting the disks as an array changes anything,
I unmounted the two disks, set them up again as a "RAID 0" array, exposed it as a logical volume,
and ran "badblocks -sw /dev/sdc" on it. I don't need the data on that disk any more, so I can
use the destructive write mode. After 2-3 days of running I got the same result: zero bad blocks.
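
For comparing runs like the ones described above, it can help to save the block list (badblocks can write it to a file with "-o", one block number per line) and collapse it into contiguous ranges, which shows whether the failures cluster in a few regions. The following is only a sketch; the file name "bad.txt" and the function name are made up for illustration.

```python
def group_bad_blocks(lines):
    """Collapse block numbers (one per line) into sorted, inclusive (start, end) ranges."""
    # Ignore blank lines and duplicates; int() tolerates trailing newlines.
    blocks = sorted({int(line) for line in lines if line.strip()})
    ranges = []
    for block in blocks:
        if ranges and block == ranges[-1][1] + 1:
            # Extends the current run of consecutive blocks.
            ranges[-1] = (ranges[-1][0], block)
        else:
            # Starts a new run.
            ranges.append((block, block))
    return ranges


if __name__ == "__main__":
    import sys
    # Hypothetical usage: python group_bad.py bad.txt
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            for start, end in group_bad_blocks(handle):
                print(f"{start}-{end} ({end - start + 1} blocks)")
```

An empty file yields no ranges, which is what the later zero-bad-block runs would produce.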

Why is that? I thought that if the magnetic surface deteriorates over time to the point where wrong data
is written or read, the damage should be permanent. The other thing that changed is that the array is now
connected to another connector of the RAID controller. If there had been a bad connection in the connector,
I would expect random I/O errors, but they were on very specific files, so that rules out
a bad contact at the connector.
So, what else might have changed to suddenly make the bad blocks "good"?
Before re-running "badblocks" on the old disks I was going to throw them away, but now I'm not
sure. Maybe I can still use them for a few more years?

TIA,
kaza.
 
04-09-2019, 04:56 AM   #2
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 17,696

Rep: Reputation: 2661
You've reformatted, that's why.

RAID 0 has data striped between disks; the one version of it I came across in the wild had a third drive with synchronization data, but yours didn't. You probably got filesystem errors or control errors rather than platter gouges. Everything except platter gouges gets wiped when you reformat.
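
To illustrate the striping mentioned above: on a two-disk RAID 0, each logical block lives on exactly one member disk, so a bad block reported on the logical volume points at one physical disk. The sketch below assumes plain round-robin striping with no metadata offset, a hypothetical 256 KiB stripe size, and the 1024-byte block size that badblocks uses by default; the actual layout used by the Adaptec controller may well differ.

```python
def locate_stripe(logical_block, block_size=1024, stripe_size=256 * 1024, n_disks=2):
    """Map a logical block on a RAID 0 volume to (disk index, block on that disk).

    Assumes simple round-robin striping; stripe_size is a guess, not a
    value taken from the controller in this thread.
    """
    blocks_per_stripe = stripe_size // block_size
    # Which stripe the block falls in, and where inside that stripe.
    stripe_index, offset_in_stripe = divmod(logical_block, blocks_per_stripe)
    # Stripes alternate between disks: stripe 0 -> disk 0, stripe 1 -> disk 1, ...
    row, disk = divmod(stripe_index, n_disks)
    return disk, row * blocks_per_stripe + offset_in_stripe


if __name__ == "__main__":
    for block in (0, 256, 300, 512):
        print(block, locate_stripe(block))
```

With these assumed parameters, consecutive 256-block runs alternate between the two disks, so a cluster of bad blocks shorter than one stripe would sit entirely on one drive.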

The takeaway is that this didn't last like you thought it did. You need to regularly
  1. Check the integrity of those disks
  2. Then back up

If the integrity fails, restore the backup.
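
One rough way to do the "check first, then back up" step is to read every file end to end and note which ones raise an I/O error, before handing the tree to tar. This is only a sketch: the function name and the path in the usage example are hypothetical, and it is not a substitute for a disk-level check.

```python
import os


def find_unreadable_files(root, chunk_size=1 << 20):
    """Read every regular file under root; return [(path, error message)] for failures."""
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks, sockets, device nodes and similar entries.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    # Read in chunks until EOF so large files do not fill memory.
                    while handle.read(chunk_size):
                        pass
            except OSError as exc:
                failures.append((path, str(exc)))
    return failures


if __name__ == "__main__":
    # Hypothetical usage: check /home before running the backup.
    for path, error in find_unreadable_files("/home"):
        print(f"unreadable: {path}: {error}")
```

Note that recently used files may be served from the page cache rather than from the disk, so a clean pass here does not prove the media is healthy; it only shows that the files could be read at that moment.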
 
04-09-2019, 06:45 PM   #3
kaza
Member
 
Registered: Apr 2010
Distribution: FC17
Posts: 383

Original Poster
Rep: Reputation: 3
Thanks for the info business_kid, I'll try the disk tools.

I tried gnome-disks and smartctl; the output doesn't look like what I expected.
See a separate thread on it.

Thanks,
kaza.
 
  

