LinuxQuestions.org


ambius 10-02-2013 11:14 PM

Potential HDD issue/failure (still unsure after troubleshooting...)
 
Hello all!

I'm having an issue trying to back up an older 500GB HDD (/dev/sdc) to a newer 2TB Seagate drive (/dev/sdb), simply using the following:

pv /dev/sdc | dd of=./where_sdb_is_mounted/backup.img conv=notrunc

About 29 GB into the operation, it suddenly quit on me. When I tried to ls the mounted directory, I got the following error.

Code:

ls: reading directory .: Input/output error

I also find this in dmesg (mind you, this is just a snippet):

Code:

[10995.659351] Buffer I/O error on device sdb1, logical block 7319543
[10995.659357] EXT4-fs warning (device sdb1): ext4_end_bio:286: I/O error writing to inode 12 (offset 28017795072 size 126976 starting block 7319800)
[10995.659369] Buffer I/O error on device sdb1, logical block 7319544
[10995.659373] Buffer I/O error on device sdb1, logical block 7319545
[10995.659377] Buffer I/O error on device sdb1, logical block 7319546
[10995.659381] Buffer I/O error on device sdb1, logical block 7319547
[10995.659385] Buffer I/O error on device sdb1, logical block 7319548
[10995.659389] Buffer I/O error on device sdb1, logical block 7319549
[10995.659393] Buffer I/O error on device sdb1, logical block 7319550
[10995.659397] Buffer I/O error on device sdb1, logical block 7319551
[10995.659403] EXT4-fs warning (device sdb1): ext4_end_bio:286: I/O error writing to inode 12 (offset 28017917952 size 32768 starting block 7319808)
[10995.659634] Aborting journal on device sdb1-8.
[10995.659651] JBD2: Error -5 detected when updating journal superblock for sdb1-8.
[10995.659836] EXT4-fs (sdb1): delayed block allocation failed for inode 12 at logical offset 6840320 with max blocks 2048 with error -30
[10995.659839] EXT4-fs (sdb1): This should not happen!! Data will be lost
[10995.660377] sd 6:0:0:0: [sdb] Unhandled error code
[10995.660380] sd 6:0:0:0: [sdb] 
[10995.660383] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[10995.660385] sd 6:0:0:0: [sdb] CDB:
[10995.660388] Write(10): 2a 00 03 7d 19 d0 00 00 f0 00
[10995.660397] end_request: I/O error, dev sdb, sector 58530256
[10995.660400] Buffer I/O error on device sdb1, logical block 7316026
[10995.660404] Buffer I/O error on device sdb1, logical block 7316027
[10995.660406] Buffer I/O error on device sdb1, logical block 7316028
[10995.660409] Buffer I/O error on device sdb1, logical block 7316029
[10995.660412] Buffer I/O error on device sdb1, logical block 7316030
[10995.660414] Buffer I/O error on device sdb1, logical block 7316031
[10995.660420] Buffer I/O error on device sdb1, logical block 7316032
[10995.660423] Buffer I/O error on device sdb1, logical block 7316033
[10995.660426] Buffer I/O error on device sdb1, logical block 7316034
[10995.660428] Buffer I/O error on device sdb1, logical block 7316035
[10995.660431] Buffer I/O error on device sdb1, logical block 7316036
[10995.660433] Buffer I/O error on device sdb1, logical block 7316037
[10995.660436] Buffer I/O error on device sdb1, logical block 7316038
[10995.660439] Buffer I/O error on device sdb1, logical block 7316039
[10995.660441] Buffer I/O error on device sdb1, logical block 7316040
[10995.660444] Buffer I/O error on device sdb1, logical block 7316041
[10995.660446] Buffer I/O error on device sdb1, logical block 7316042
[10995.660451] Buffer I/O error on device sdb1, logical block 7316043
[10995.660456] Buffer I/O error on device sdb1, logical block 7316044
[10995.660461] Buffer I/O error on device sdb1, logical block 7316045
[10995.660467] Buffer I/O error on device sdb1, logical block 7316046
[10995.660472] Buffer I/O error on device sdb1, logical block 7316047
[10995.660477] Buffer I/O error on device sdb1, logical block 7316048
[10995.660482] Buffer I/O error on device sdb1, logical block 7316049
[10995.660487] Buffer I/O error on device sdb1, logical block 7316050
[10995.660491] Buffer I/O error on device sdb1, logical block 7316051
[10995.660496] Buffer I/O error on device sdb1, logical block 7316052
[10995.660501] Buffer I/O error on device sdb1, logical block 7316053
[10995.660504] EXT4-fs error (device sdb1) in ext4_da_writepages:2576: Journal has aborted
[10995.660511] Buffer I/O error on device sdb1, logical block 7316054
[10995.660516] Buffer I/O error on device sdb1, logical block 7316055
[10995.660523] EXT4-fs warning (device sdb1): ext4_end_bio:286: I/O error writing to inode 12 (offset 28003508224 size 126976 starting block 7316312)
[10995.660539] EXT4-fs (sdb1): previous I/O error to superblock detected
[10995.660551] EXT4-fs error (device sdb1): __ext4_journal_start_sb:62: Detected aborted journal
[10995.660567] EXT4-fs (sdb1): Remounting filesystem read-only
[10995.660573] EXT4-fs (sdb1): previous I/O error to superblock detected
[10995.660583] EXT4-fs (sdb1): ext4_da_writepages: jbd2_start: 26944 pages, ino 12; err -30
[10995.661847] sd 6:0:0:0: [sdb] Synchronizing SCSI cache
[10995.661891] sd 6:0:0:0: [sdb] 
[10995.661894] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[10995.686004] systemd-journald[188]: Got invalid event from epoll.
[10995.735577] usb 2-1.1: new high-speed USB device number 6 using ehci-pci
[10995.880460] usb 2-1.1: New USB device found, idVendor=0bc2, idProduct=a0a4
[10995.880467] usb 2-1.1: New USB device strings: Mfr=2, Product=3, SerialNumber=1
[10995.880470] usb 2-1.1: Product: USB
[10995.880473] usb 2-1.1: Manufacturer: Seagate
[10995.880475] usb 2-1.1: SerialNumber: 2HC015KJ
[10995.881783] usb-storage 2-1.1:1.0: USB Mass Storage device detected
[10995.881860] scsi8 : usb-storage 2-1.1:1.0
[10996.043181] EXT4-fs error (device sdb1): ext4_find_entry:1309: inode #2: comm pool: reading directory lblock 0
[10996.043205] EXT4-fs error (device sdb1): ext4_find_entry:1309: inode #2: comm pool: reading directory lblock 0

Also, a quick look at blkid shows that sdb has now been detected as sdd!

A quick check with: smartctl -H /dev/sdd shows:
Code:

SMART overall-health self-assessment test result: PASSED

This has happened to me twice (although not necessarily after 29 GB of copying). After rebooting the PC, the drive is properly detected as sdb again...

But I guess my problem is that I can't tell whether this is caused by a drive failure, a USB controller failure, or something else (possibly Linux-related...).

Anyone who might have a clue as to what is going on, your input would be greatly appreciated.

uname = 3.10.11-200.fc19.x86_64
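
For anyone reading the dmesg excerpt later, the numbers in it can be cross-checked with a little shell arithmetic. This is only a sketch: the 512-byte sector size and 4096-byte ext4 block size are assumptions (the thread never states them), and the variable names are made up for illustration. The values themselves are copied from the log above.

```shell
#!/bin/sh
# Figures copied from the dmesg excerpt in the post.
cdb="2a 00 03 7d 19 d0 00 00 f0 00"  # bytes of the failed Write(10) command
err_sector=58530256                   # "end_request: I/O error, dev sdb, sector ..."
first_block=7316026                   # first sdb1 logical block reported after that line
last_block=7316055                    # last sdb1 logical block in the same run

# Assumptions, not stated in the thread: 512-byte sectors, 4096-byte ext4 blocks.
sector_size=512
block_size=4096
sectors_per_block=$(( block_size / sector_size ))

# Write(10) layout (bytes counted from 0): bytes 2-5 hold the big-endian
# start LBA, bytes 7-8 hold the number of sectors to write.
set -- $cdb
lba=$(( 0x$3$4$5$6 ))
count=$(( 0x$8$9 ))

blocks=$(( last_block - first_block + 1 ))
part_start=$(( err_sector - first_block * sectors_per_block ))
byte_offset=$(( lba * sector_size ))

echo "start LBA from CDB:    $lba"
echo "sectors in the write:  $count"
echo "sectors in block run:  $(( blocks * sectors_per_block ))"
echo "implied sdb1 start:    sector $part_start"
echo "offset into the disk:  $byte_offset bytes"
```

Under those assumptions the LBA in the command matches the reported sector (58530256), the 30 failed blocks add up to the 240 sectors of that single write, and sdb1 would start at sector 2048. Read that way, the long run of "Buffer I/O error" lines looks like one rejected write rather than many separate bad spots, which fits the DID_NO_CONNECT host byte (the device was no longer reachable) better than a media error.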

syg00 10-03-2013 04:51 AM

Looks like the USB drive is "going away", possibly from too much power draw on the interface. Is it separately powered? If not, can you make it so?

When doing something like this I prefer to attach the device directly to the internal bus rather than USB. But if you must use USB, try a separate power feed.
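
A rough way to check for the "going away" pattern is to look for the two signatures visible in the log above: a DID_NO_CONNECT result followed by the device being enumerated again on the USB bus. The sketch below just counts those lines; the function name is hypothetical, and the patterns are taken from the messages in this thread, so other kernels may word them differently.

```shell
#!/bin/sh
# Summarise signs of a USB drop-out in kernel log text read from stdin.
# Patterns are based on the messages quoted in this thread (an assumption
# about wording; other kernel versions may differ).
usb_dropout_summary() {
    awk '
        /hostbyte=DID_NO_CONNECT/             { lost++ }
        /usb .*: new .*USB device number/     { reenum++ }
        END { printf "no_connect=%d reenumerated=%d\n", lost, reenum }
    '
}

# Typical use, left commented out because it needs a live kernel log:
#   dmesg | usb_dropout_summary
```

Non-zero counts for both, close together in time, suggest the enclosure or link dropped and came back (which would also explain the drive reappearing under a new name) rather than the disk reporting bad sectors.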

H_TeXMeX_H 10-03-2013 04:55 AM

Try running a SMART long test on it:
Code:

smartctl -t long /dev/sdb

Then wait for it to finish and post the results:
Code:

smartctl -a /dev/sdb
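
Since the long test runs in the background on the drive itself, progress can be polled instead of guessing when it is done. The helper below is a sketch that pulls the "NN% of test remaining" figure out of `smartctl -c` output; the function name is hypothetical, and the parsing assumes the status wording smartmontools prints for ATA drives. Behind some USB bridges smartctl may also need `-d sat`, and the device name may differ (the drive showed up as sdd earlier in the thread).

```shell
#!/bin/sh
# Print the percentage still to run, given `smartctl -c` output on stdin.
# Prints nothing when no "% of test remaining" line is present.
selftest_percent_remaining() {
    sed -n 's/^[^0-9]*\([0-9][0-9]*\)% of test remaining.*/\1/p'
}

# Hypothetical usage (needs root and a real device, so left commented out):
#   dev=/dev/sdb
#   smartctl -t long "$dev"              # start the extended self-test
#   while left=$(smartctl -c "$dev" | selftest_percent_remaining)
#         [ -n "$left" ]
#   do
#       echo "$left% remaining"
#       sleep 600
#   done
#   smartctl -l selftest "$dev"          # show the self-test log
```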

ambius 10-04-2013 08:22 PM

Thanks for the replies!

syg00, thanks for the suggestion. Yeah, I guess it would make more sense to directly connect it to an internal bus. Normally, I think that's what I would have done, but I don't have a tower at the moment... but thanks for the reply. I didn't think of power issues, but the more I think about the problem, the more that makes sense that it's a possibility. Even though the enclosures are separately connected to power, they're connected to power bars that have a few more devices plugged into them than I would normally, comfortably admit to.

H_TeXMeX_H, thanks for the reply! In the end, I didn't bother with the long test, as it said it would probably take about half a day to complete. Given the nature of the problem, that wait seemed excessive considering that I haven't had an issue with the drive yet. Certainly, that doesn't mean there isn't a problem, but I think I'm willing to take the risk.

My solution was to copy the user files with rsync, using its data verification options. At least I know they were written to the newer disk without any problems. It doesn't mean I can restore the OS the way I would like to... but I'm fine with having the cake; I guess I don't need to eat it too...

Thanks again!
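
The post does not say which rsync options were used, so the following is only one plausible way to do a copy with verification. It copies in archive mode, then makes a second, dry-run pass with `--checksum` so that both sides are re-read and compared by content; any file listed in that pass differs. The function name and the paths in the usage comment are hypothetical.

```shell
#!/bin/sh
# Copy a directory tree with rsync, then verify it with a checksum-based
# dry run. Prints "verified" on success; lists differences and fails otherwise.
copy_and_verify() {
    src=$1
    dst=$2

    # First pass: the actual copy (archive mode keeps times and permissions).
    rsync -a "$src"/ "$dst"/ || return 1

    # Second pass: change nothing, compare file contents by checksum and
    # itemize anything that would still need to be transferred.
    diff_list=$(rsync -a --checksum --dry-run --itemize-changes "$src"/ "$dst"/) || return 1

    if [ -n "$diff_list" ]; then
        printf 'Differences remain:\n%s\n' "$diff_list" >&2
        return 1
    fi
    echo "verified"
}

# Hypothetical usage:
#   copy_and_verify /home/user /mnt/backup/home-user
```

The second pass is slow on large trees because every file is read again on both sides, but that re-read is what gives some confidence the data landed on the new disk intact.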


All times are GMT -5.