Old 09-28-2008, 09:22 PM   #1
Red Squirrel
Senior Member
 
Registered: Dec 2003
Distribution: Mint 17.1 KDE on workstation, CentOS 6.x on servers
Posts: 1,153

Rep: Reputation: 47
"faulty spare" do I have a bad HDD?


I have one HDD in my new RAID array with very long I/O wait times. It grinds my system to a halt: when a backup job kicks in, the load climbs to around 6 and the system becomes unusable.

To rule out the slot, I pulled the drive and put it in another slot.

I then ran mdadm --manage --add /dev/md0 /dev/sde1

where sde1 is the new name (it was sdc1 before).

But it takes a while to add, then shows this:

Code:
    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       3       8       65        1      faulty spare rebuilding   /dev/sde1
       2       8       48        2      active sync   /dev/sdd

       4       8       33        -      faulty spare

Then this:


Code:
    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       0        0        1      removed
       2       8       48        2      active sync   /dev/sdd

       3       8       65        -      faulty spare   /dev/sde1
       4       8       33        -      faulty spare

Does this mean the drive is dying so it can't be read properly?
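
For reference, the device table above can also be read mechanically. Below is a minimal Python sketch that parses rows in the `mdadm --detail` layout shown in this post and flags the faulty ones. The names `parse_rows`, `faulty`, and `sd_name` are made up for illustration, and the major/minor decoding assumes the conventional Linux numbering for SCSI/SATA disks (major 8, 16 minors per disk).

```python
import re

# Sample copied from the second device table in the post above.
SAMPLE = """\
    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       0        0        1      removed
       2       8       48        2      active sync   /dev/sdd

       3       8       65        -      faulty spare   /dev/sde1
       4       8       33        -      faulty spare
"""

# Four numeric columns (the RaidDevice column may be "-"), then the state text.
ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+|-)\s+(.*\S)\s*$")


def parse_rows(text):
    """Return one dict per device row; header and blank lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue
        number, major, minor, slot, rest = m.groups()
        tokens = rest.split()
        # The device path, when present, is the last token and starts with "/".
        device = tokens.pop() if tokens[-1].startswith("/") else None
        rows.append({
            "number": int(number),
            "major": int(major),
            "minor": int(minor),
            "slot": None if slot == "-" else int(slot),
            "state": " ".join(tokens),
            "device": device,
        })
    return rows


def faulty(rows):
    """Rows whose state includes the word 'faulty'."""
    return [r for r in rows if "faulty" in r["state"].split()]


def sd_name(major, minor):
    """Decode a major-8 device number to /dev/sdXN (assumed numbering)."""
    if major != 8:
        return None
    letter = chr(ord("a") + minor // 16)
    partition = minor % 16
    return f"/dev/sd{letter}{partition or ''}"


if __name__ == "__main__":
    for row in faulty(parse_rows(SAMPLE)):
        name = row["device"] or sd_name(row["major"], row["minor"])
        print(row["number"], row["state"], name)
```

Under that numbering assumption, the nameless entry with major 8, minor 33 decodes to /dev/sdc1, which matches the drive's old name mentioned earlier in this post.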






Edit:

More output from dmesg (probably easier to read from the bottom up):

Code:
ata7.00: cmd 35/00:58:e7:05:00/00:02:00:00:00/e0 tag 0 dma 307200 out
         res 51/84:49:e7:05:00/05:00:00:00:00/e0 Emask 0x10 (ATA bus error)
ata7.00: status: { DRDY ERR }
ata7.00: error: { ICRC ABRT }
ata7: hard resetting link
ata7: SATA link up 1.5 Gbps (SStatus 113 SControl 10)
ata7.00: configured for UDMA/33
ata7: EH complete
sd 6:0:0:0: [sde] 1953525168 512-byte hardware sectors (1000205 MB)
sd 6:0:0:0: [sde] Write Protect is off
sd 6:0:0:0: [sde] Mode Sense: 00 3a 00 00
sd 6:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
ata7.00: irq_stat 0x00020002, device error via D2H FIS
ata7.00: cmd 35/00:58:e7:05:00/00:02:00:00:00/e0 tag 0 dma 307200 out
         res 51/84:49:e7:05:00/05:00:00:00:00/e0 Emask 0x10 (ATA bus error)
ata7.00: status: { DRDY ERR }
ata7.00: error: { ICRC ABRT }
ata7: hard resetting link
ata7: SATA link up 1.5 Gbps (SStatus 113 SControl 10)
ata7.00: configured for UDMA/33
ata7: EH complete
sd 6:0:0:0: [sde] 1953525168 512-byte hardware sectors (1000205 MB)
sd 6:0:0:0: [sde] Write Protect is off
sd 6:0:0:0: [sde] Mode Sense: 00 3a 00 00
sd 6:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
ata7.00: irq_stat 0x00020002, device error via D2H FIS
ata7.00: cmd 35/00:58:e7:05:00/00:02:00:00:00/e0 tag 0 dma 307200 out
         res 51/84:49:e7:05:00/05:00:00:00:00/e0 Emask 0x10 (ATA bus error)
ata7.00: status: { DRDY ERR }
ata7.00: error: { ICRC ABRT }
ata7: hard resetting link
ata7: SATA link up 1.5 Gbps (SStatus 113 SControl 10)
ata7.00: configured for UDMA/33
ata7: EH complete
sd 6:0:0:0: [sde] 1953525168 512-byte hardware sectors (1000205 MB)
sd 6:0:0:0: [sde] Write Protect is off
sd 6:0:0:0: [sde] Mode Sense: 00 3a 00 00
sd 6:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
ata7.00: irq_stat 0x00020002, device error via D2H FIS
ata7.00: cmd 35/00:58:e7:05:00/00:02:00:00:00/e0 tag 0 dma 307200 out
         res 51/84:49:e7:05:00/05:00:00:00:00/e0 Emask 0x10 (ATA bus error)
ata7.00: status: { DRDY ERR }
ata7.00: error: { ICRC ABRT }
ata7: hard resetting link
ata7: SATA link up 1.5 Gbps (SStatus 113 SControl 10)
ata7.00: configured for UDMA/33
ata7: EH complete
sd 6:0:0:0: [sde] 1953525168 512-byte hardware sectors (1000205 MB)
sd 6:0:0:0: [sde] Write Protect is off
sd 6:0:0:0: [sde] Mode Sense: 00 3a 00 00
sd 6:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
ata7.00: irq_stat 0x00020002, device error via D2H FIS
ata7.00: cmd 35/00:58:e7:05:00/00:02:00:00:00/e0 tag 0 dma 307200 out
         res 51/84:3a:e7:05:00/05:00:00:00:00/e0 Emask 0x10 (ATA bus error)
ata7.00: status: { DRDY ERR }
ata7.00: error: { ICRC ABRT }
ata7: hard resetting link
ata7: softreset failed (SRST command error)
ata7: reset failed (errno=-5), retrying in 8 secs
ata7: hard resetting link
ata7: controller in dubious state, performing PORT_RST
ata7: softreset failed (timeout)
ata7: hard resetting link
ata7: softreset failed (timeout)
ata7: hard resetting link
ata7: softreset failed (timeout)
ata7: reset failed, giving up
ata7.00: disabled
sd 6:0:0:0: [sde] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
sd 6:0:0:0: [sde] Sense Key : Aborted Command [current] [descriptor]
Descriptor sense data with sense descriptors (in hex):
        72 0b 47 00 00 00 00 0c 00 0a 80 00 00 00 00 00
        00 00 05 e7
sd 6:0:0:0: [sde] Add. Sense: Scsi parity error
end_request: I/O error, dev sde, sector 1511
ata7: EH complete
sd 6:0:0:0: [sde] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sde, sector 1953519935
end_request: I/O error, dev sde, sector 1953519935
md: super_written gets error=-5, uptodate=0
md: md0: recovery done.
RAID5 conf printout:
 --- rd:3 wd:2
 disk 0, o:1, dev:sdb1
 disk 1, o:0, dev:sde1
 disk 2, o:1, dev:sdd
RAID5 conf printout:
 --- rd:3 wd:2
 disk 0, o:1, dev:sdb1
 disk 2, o:1, dev:sdd
md: unbind<sde1>
md: export_rdev(sde1)
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
sd 6:0:0:0: [sde] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sde, sector 0
Buffer I/O error on device sde, logical block 0
Buffer I/O error on device sde, logical block 1
Buffer I/O error on device sde, logical block 2
Buffer I/O error on device sde, logical block 3
sd 6:0:0:0: [sde] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sde, sector 0
Buffer I/O error on device sde, logical block 0
sd 6:0:0:0: [sde] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sde, sector 1953525160
Buffer I/O error on device sde, logical block 244190645
sd 6:0:0:0: [sde] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sde, sector 1953525160
Buffer I/O error on device sde, logical block 244190645
sd 6:0:0:0: [sde] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sde, sector 0
Buffer I/O error on device sde, logical block 0
Buffer I/O error on device sde, logical block 1
Buffer I/O error on device sde, logical block 2
sd 6:0:0:0: [sde] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
end_request: I/O error, dev sde, sector 0
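
The repeating pattern in the log above (an ICRC error, then a link reset, and eventually a failed reset that disables the port) can be tallied per ATA port. ICRC is the interface CRC error bit, which reports corruption on the link between controller and drive, so a high count is often taken as a hint to check the cable, connector, or backplane as well as the drive; it is not conclusive on its own. A minimal Python sketch follows. The function name `summarize` and the category names are hypothetical, and the sample lines are excerpted from the log above.

```python
import re
from collections import Counter

# A few lines excerpted from the dmesg output above.
SAMPLE = """\
ata7.00: status: { DRDY ERR }
ata7.00: error: { ICRC ABRT }
ata7: hard resetting link
ata7: SATA link up 1.5 Gbps (SStatus 113 SControl 10)
ata7.00: configured for UDMA/33
ata7: softreset failed (timeout)
ata7: reset failed, giving up
ata7.00: disabled
end_request: I/O error, dev sde, sector 1511
"""

# Matches the "ataN:" or "ataN.MM:" prefix and captures the port name.
PORT = re.compile(r"^(ata\d+)(?:\.\d+)?:")

PATTERNS = {
    "icrc": re.compile(r"error: \{[^}]*\bICRC\b"),
    "link_reset": re.compile(r"hard resetting link"),
    # \b keeps "softreset failed" from being counted here.
    "reset_failed": re.compile(r"\breset failed"),
    "disabled": re.compile(r": disabled$"),
}


def summarize(log_text):
    """Count selected libata events per port, e.g. {'ata7': Counter(...)}."""
    counts = {}
    for raw in log_text.splitlines():
        line = raw.strip()
        m = PORT.match(line)
        if not m:
            continue  # not a libata line (e.g. sd or md messages)
        port = m.group(1)
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts.setdefault(port, Counter())[name] += 1
    return counts


if __name__ == "__main__":
    for port, events in summarize(SAMPLE).items():
        print(port, dict(events))
```

To run it on a live system, the text could come from `dmesg` piped into a file and read back in; how to interpret the counts is a judgment call.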

Last edited by Red Squirrel; 09-28-2008 at 09:36 PM.
 
Old 09-29-2008, 10:58 AM   #2
archtoad6
Senior Member
 
Registered: Oct 2004
Location: Houston, TX (usa)
Distribution: MEPIS, Debian, Knoppix,
Posts: 4,727
Blog Entries: 15

Rep: Reputation: 234
I think you're right.

Have you used its SMART (man:/smartctl in Konqueror) capabilities to check it out?

Some folks have had good luck w/ SpinRite (grc.com). I have tried it, but either it's not what it's touted to be or my drive was too far gone. SpinRite pros & cons:
  • Cons:
    • Proprietary
    • $$$
    • Capabilities may be overhyped
  • Pros:
    • Honor system
    • Free trial
    • Money-back guarantee
I've never tried this, but I believe that zeroing the entire drive w/ dd might trigger its internal sector replacement mechanism.

Even if any of these work, do you really trust it w/ your data?

BTW, do you have a spare? If so, I would get it into the array ASAP -- if another drive fails, you're hosed.
 
Old 09-29-2008, 03:21 PM   #3
Red Squirrel
Senior Member
 
Registered: Dec 2003
Distribution: Mint 17.1 KDE on workstation, CentOS 6.x on servers
Posts: 1,153

Original Poster
Rep: Reputation: 47
Here's something weird. I put it in another drive array and it seemed to be picked up fine. I then put it back in its original array and it came up under a different letter (the old letters were still registered even though I had pulled the drive out). It rebuilt fine. Then I rebooted, and it was no longer recognized because the drive letter had changed back. It's rebuilding now. So this is really weird. I'm hoping it's the drive and not the backplane, though...


Oh, and I also ran a SMART test on it, but whenever I tried to see the results it kept saying the device does not support it. But it's a fairly modern hard drive (Seagate Barracuda ST31000340AS), so it does support SMART, and SMART is enabled as well. When I run smartctl --all, it gives me info and such.
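
For reference, the usual smartctl sequence is to start a self-test, wait for it to finish, and then read the self-test log. The Python sketch below only builds those command lines. The function names are hypothetical, the device path in the example is the one from this thread, and nothing here explains why the results were reported as unsupported on this particular setup.

```python
import subprocess


def smart_commands(device):
    """Build smartctl command lines for a basic check of one drive."""
    return {
        # Start a short self-test; the drive runs it in the background.
        "start_short_test": ["smartctl", "-t", "short", device],
        # Read the self-test log once the test has had time to finish.
        "selftest_log": ["smartctl", "-l", "selftest", device],
        # Overall health assessment reported by the drive.
        "health": ["smartctl", "-H", device],
        # Vendor attribute table (reallocated sectors, CRC error count, ...).
        "attributes": ["smartctl", "-A", device],
    }


def run(cmd):
    """Run one command and return its standard output (usually needs root)."""
    return subprocess.run(cmd, capture_output=True, text=True).stdout


if __name__ == "__main__":
    # Print the commands instead of running them; call run(cmd) to execute.
    for name, cmd in smart_commands("/dev/sde").items():
        print(f"{name}: {' '.join(cmd)}")
```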

Last edited by Red Squirrel; 09-29-2008 at 04:52 PM.
 
Old 09-29-2008, 06:23 PM   #4
Bruce Hill
HCL Maintainer
 
Registered: Jun 2003
Location: McCalla, AL, USA
Distribution: Gentoo (all servers at work are openSUSE)
Posts: 6,938

Rep: Reputation: 128
Perhaps I'm wrong, as I've never had a failed drive and had to
rebuild my array. However, reading "man mdadm", it sounds
as if you didn't follow the proper steps to remove the drive,
then re-add it, then rebuild.
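
As a rough illustration of the steps meant here, the sketch below builds the mdadm manage-mode sequence of marking a member failed, removing it, and adding it back. It is a sketch under assumptions, not a tested procedure for this array: the function name is hypothetical, the device names are the ones from this thread, and a drive that was physically pulled and came back under a new name may need different handling.

```python
import shlex


def replace_member_commands(array, member):
    """Build the fail / remove / add sequence for one array member."""
    return [
        ["mdadm", "--manage", array, "--fail", member],
        ["mdadm", "--manage", array, "--remove", member],
        ["mdadm", "--manage", array, "--add", member],
    ]


if __name__ == "__main__":
    # Dry run: print the commands for review rather than executing them.
    for cmd in replace_member_commands("/dev/md0", "/dev/sde1"):
        print(shlex.join(cmd))
```

After the add, rebuild progress can be followed in /proc/mdstat.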

You should download and run SeaTools on the drive, especially
since it has a 5-year limited warranty.
 
  

