Hello,
I was in a terminal, browsing a directory under my home folder. My "home" directory is on "/dev/sdb1".
When I typed "ls" in one of my directories, the output was garbage: it didn't list the files in the directory. I think it said something like "input/output error". Unfortunately, I didn't write down the exact error; instead, I rebooted.
The hard disk with the problem is:
Code:
$ sudo hdparm -I /dev/sdb
[sudo] password for brian:
/dev/sdb:
ATA device, with non-removable media
Model Number: WDC WD5000KS-00MNB0
Serial Number: WD-WCANU1019633
Firmware Revision: 07.02E07
Standards:
Supported: 7 6 5 4
Likely used: 8
Configuration:
Logical max current
cylinders 16383 16383
heads 16 16
sectors/track 63 63
--
CHS current addressable sectors: 16514064
LBA user addressable sectors: 268435455
LBA48 user addressable sectors: 976773168
Logical/Physical Sector size: 512 bytes
device size with M = 1024*1024: 476940 MBytes
device size with M = 1000*1000: 500107 MBytes (500 GB)
cache/buffer size = 16384 KBytes
Capabilities:
LBA, IORDY(can be disabled)
Queue depth: 32
Standby timer values: spec'd by Standard, with device specific minimum
R/W multiple sector transfer: Max = 16 Current = 8
Recommended acoustic management value: 128, current value: 128
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* SMART feature set
Security Mode feature set
* Power Management feature set
* Write cache
* Look-ahead
* Host Protected Area feature set
* WRITE_BUFFER command
* READ_BUFFER command
* NOP cmd
* DOWNLOAD_MICROCODE
Power-Up In Standby feature set
* SET_FEATURES required to spinup after power up
SET_MAX security extension
* Automatic Acoustic Management feature set
* 48-bit Address feature set
* Device Configuration Overlay feature set
* Mandatory FLUSH_CACHE
* FLUSH_CACHE_EXT
* SMART error logging
* SMART self-test
* General Purpose Logging feature set
* WRITE_{DMA|MULTIPLE}_FUA_EXT
* 64-bit World wide name
* Gen1 signaling speed (1.5Gb/s)
* Gen2 signaling speed (3.0Gb/s)
* Native Command Queueing (NCQ)
* Host-initiated interface power management
* Phy event counters
* DMA Setup Auto-Activate optimization
* Software settings preservation
* SMART Command Transport (SCT) feature set
* SCT Long Sector Access (AC1)
* SCT LBA Segment Access (AC2)
* SCT Error Recovery Control (AC3)
* SCT Features Control (AC4)
* SCT Data Tables (AC5)
unknown 206[12] (vendor specific)
Security:
Master password revision code = 65534
supported
not enabled
not locked
frozen
not expired: security count
not supported: enhanced erase
138min for SECURITY ERASE UNIT.
Logical Unit WWN Device Identifier: 50014ee20002257a
NAA : 5
IEEE OUI : 0014ee
Unique ID : 20002257a
Checksum: correct
uname output:
Code:
$ uname -r
2.6.32-5-amd64
lsb_release output:
Code:
$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 6.0.1 (squeeze)
Release: 6.0.1
Codename: squeeze
During the reboot my computer was unable to mount my "home" directory on "/dev/sdb1", although my other devices were still visible. I also saw a message that said something like "fsck unable to resolve: 'UUID=0f24fae1-135c-4750-9928-4632e2f04f45'". That's the UUID of my "home" filesystem on "/dev/sdb1".
fstab output:
Code:
$ cat /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point> <type> <options> <dump> <pass>
proc /proc proc defaults 0 0
# / was on /dev/sda2 during installation
UUID=f13fe524-b8ae-4f35-8831-9ba9e9db2dfa / ext4 errors=remount-ro 0 1
# /home was on /dev/sdb1 during installation
UUID=0f24fae1-135c-4750-9928-4632e2f04f45 /home ext4 defaults 0 2
# /wd500 was on /dev/sdc1 during installation
UUID=1fba7d0c-e82c-4837-b1bd-6192d7dd3f88 /wd500 ext4 rw,user,exec 0 2
# /wdgiga was on /dev/sdc3 during installation
UUID=3baa432d-3480-402d-8df3-1b90dbc5f655 /wdgiga ext4 rw,user,exec 0 2
# /wdtera was on /dev/sdc2 during installation
UUID=9a762875-5e0d-4edf-8bdd-8aaaea6403d5 /wdtera ext4 rw,user,exec 0 2
# /xtraSpace was on /dev/sda3 during installation
UUID=654d5c57-129b-4e06-92f1-673a8b4bcf56 /xtraSpace ext4 defaults 0 2
# swap was on /dev/sda1 during installation
UUID=f7fc69af-d475-44f1-87dd-63eb8ca0b7ed none swap sw 0 0
/dev/scd1 /media/cdrom0 udf,iso9660 user,noauto 0 0
/dev/scd0 /media/cdrom1 udf,iso9660 user,noauto 0 0
#/dev/sdc1 /media/usb0 auto rw,user,noauto 0 0
#/dev/sdc2 /media/usb1 auto rw,user,noauto 0 0
#/dev/sdc3 /media/usb2 auto rw,user,noauto 0 0
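For anyone who wants to see which of these UUID entries the system can currently resolve, here is a minimal sketch. It assumes udev populates /dev/disk/by-uuid (the usual case on Debian), and the function name check_fstab_uuids is made up for this post, not a standard tool.

```shell
# Report, for every UUID= entry in an fstab file, whether a device
# with that UUID is currently present.
# $1: fstab path (default /etc/fstab)
# $2: directory of by-uuid symlinks (default /dev/disk/by-uuid)
check_fstab_uuids() {
    fstab="${1:-/etc/fstab}"
    byuuid="${2:-/dev/disk/by-uuid}"
    # Comment lines never start with UUID=, so they are skipped.
    # Print "uuid mountpoint" for each UUID= entry.
    awk '$1 ~ /^UUID=/ { sub(/^UUID=/, "", $1); print $1, $2 }' "$fstab" |
    while read -r uuid mountpoint; do
        if [ -e "$byuuid/$uuid" ]; then
            echo "ok       $mountpoint ($uuid)"
        else
            echo "MISSING  $mountpoint ($uuid)"
        fi
    done
}
```

Running check_fstab_uuids with no arguments reads /etc/fstab. A "MISSING" line for /home would match the "unable to resolve" message, meaning the kernel did not see the partition at all at that moment.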
I was able to boot but had no home directory (all my stuff was backed up). I decided to reboot using the SystemRescueCd (www.sysresccd.org) and ran "FSArchiver: Filesystem Archiver for Linux". You can see from the output that it didn't see the disk holding my "home" directory (normally "/dev/sdb"). The device names shifted as a result, so the "sdb" listed here is the My Book drive that normally appears as "/dev/sdc".
The output is below:
Code:
=====================>>> fsarchiver probe simple <<<=====================
[======DISK======] [=============NAME==============] [====SIZE====] [MAJ] [MIN]
[sda ] [WDC WD800JD-75MS ] [ 74.51 GB] [ 8] [ 0]
[sdb ] [My Book 1130 ] [ 1.82 TB] [ 8] [ 16]
[=====DEVICE=====] [==FILESYS==] [======LABEL======] [====SIZE====] [MAJ] [MIN]
[loop0 ] [squashfs ] [<unknown> ] [ 265.55 MB] [ 7] [ 0]
[sda1 ] [swap ] [<unknown> ] [ 1.53 GB] [ 8] [ 1]
[sda2 ] [ext4 ] [<unknown> ] [ 14.63 GB] [ 8] [ 2]
[sda3 ] [ext4 ] [<unknown> ] [ 58.35 GB] [ 8] [ 3]
[sdb1 ] [ext4 ] [wd500 ] [ 499.37 GB] [ 8] [ 17]
[sdb2 ] [ext4 ] [<unknown> ] [ 1.33 TB] [ 8] [ 18]
[sdb3 ] [ext4 ] [<unknown> ] [ 1.00 GB] [ 8] [ 19]
I also ran gparted, but it didn't list the disk holding my home directory either. I figured the disk (normally "/dev/sdb") was dead, so I bought a replacement hard drive.
Then I rebooted without the SystemRescueCd and saw this message scroll by: "/home: recovering journal".
I also saw that message when I looked in /var/log/fsck/checkfs:
Code:
$ cat /var/log/fsck/checkfs
Log of fsck -C -R -A -a
Tue Jun 21 15:51:12 2011
fsck from util-linux-ng 2.17.2
/dev/sda3: clean, 205/3825664 files, 15093148/15295744 blocks
wd500: clean, 201315/32727040 files, 118986212/130905644 blocks
/home: recovering journal
/dev/sdc3: clean, 12/65808 files, 12660/263064 blocks
/dev/sdc2: clean, 241520/89300992 files, 275299582/357201258 blocks
/home: Clearing orphaned inode 38781677 (uid=1000, gid=1000, mode=0100644, size=32768)
/home: Clearing orphaned inode 38780933 (uid=1000, gid=1000, mode=0100600, size=77192)
/home: clean, 202121/61063168 files, 119022216/122096000 blocks
Tue Jun 21 15:52:02 2011
----------------
When I logged in, my "home" directory on "/dev/sdb1" was back. Here's my current disk space usage:
Code:
$ df -H
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 16G 12G 3.3G 79% /
tmpfs 1.9G 0 1.9G 0% /lib/init/rw
udev 1.9G 246k 1.9G 1% /dev
tmpfs 1.9G 4.1k 1.9G 1% /dev/shm
/dev/sdb1 493G 472G 21G 96% /home
/dev/sdc1 528G 471G 57G 90% /wd500
/dev/sdc3 1.1G 35M 1.1G 4% /wdgiga
/dev/sdc2 1.5T 1.2T 336G 77% /wdtera
/dev/sda3 62G 61G 830M 99% /xtraSpace
Below is a heavily truncated excerpt from /var/log/messages that may refer to the device that had problems ("/home" on "/dev/sdb1"). I don't know if it will be useful:
Code:
Jun 21 12:03:12 kub nagios3: Auto-save of retention data completed successfully.
Jun 21 12:31:24 kub kernel: [59330.816096] ata4: hard resetting link
Jun 21 12:31:29 kub kernel: [59336.180017] ata4: link is slow to respond, please be patient (ready=0)
Jun 21 12:31:34 kub kernel: [59340.828034] ata4: hard resetting link
Jun 21 12:31:39 kub kernel: [59346.188034] ata4: link is slow to respond, please be patient (ready=0)
Jun 21 12:31:44 kub kernel: [59350.836041] ata4: hard resetting link
Jun 21 12:31:49 kub kernel: [59356.196016] ata4: link is slow to respond, please be patient (ready=0)
Jun 21 12:32:19 kub kernel: [59385.876034] ata4: limiting SATA link speed to 1.5 Gbps
Jun 21 12:32:19 kub kernel: [59385.876039] ata4: hard resetting link
Jun 21 12:32:24 kub kernel: [59390.900032] ata4.00: disabled
Jun 21 12:32:24 kub kernel: [59390.900040] ata4.00: device reported invalid CHS sector 0
Jun 21 12:32:24 kub kernel: [59390.900044] ata4.00: device reported invalid CHS sector 0
Jun 21 12:32:24 kub kernel: [59390.900062] ata4: EH complete
Jun 21 12:32:24 kub kernel: [59390.900090] sd 3:0:0:0: [sdb] Unhandled error code
Jun 21 12:32:24 kub kernel: [59390.900093] sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jun 21 12:32:24 kub kernel: [59390.900099] sd 3:0:0:0: [sdb] CDB: Read(10): 28 00 26 c8 48 1f 00 01 00 00
Jun 21 12:32:24 kub kernel: [59390.900139] sd 3:0:0:0: [sdb] Unhandled error code
Jun 21 12:32:24 kub kernel: [59390.900142] sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jun 21 12:32:24 kub kernel: [59390.900146] sd 3:0:0:0: [sdb] CDB: Read(10): 28 00 26 c8 47 1f 00 01 00 00
Jun 21 12:32:24 kub kernel: [59390.998653] sd 3:0:0:0: [sdb] Unhandled error code
Jun 21 12:32:24 kub kernel: [59390.998659] sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jun 21 12:32:24 kub kernel: [59390.998665] sd 3:0:0:0: [sdb] CDB: Read(10): 28 00 0a a2 13 2f 00 00 08 00
Jun 21 12:32:30 kub kernel: [59396.804145] sd 3:0:0:0: [sdb] Unhandled error code
Jun 21 12:32:30 kub kernel: [59396.804151] sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jun 21 12:32:30 kub kernel: [59396.804157] sd 3:0:0:0: [sdb] CDB: Write(10): 2a 00 11 f2 46 bf 00 00 08 00
Jun 21 12:32:30 kub kernel: [59396.804181] lost page write due to I/O error on sdb1
Jun 21 12:32:30 kub kernel: [59396.804201] JBD2: Detected IO errors while flushing file data on sdb1-8
Jun 21 12:32:30 kub kernel: [59396.804213] sd 3:0:0:0: [sdb] Unhandled error code
Jun 21 12:32:30 kub kernel: [59396.804337] sd 3:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Jun 21 12:32:30 kub kernel: [59396.804342] sd 3:0:0:0: [sdb] CDB: Write(10): 2a 00 00 00 31 77 00 00 08 00
Jun 21 12:32:30 kub kernel: [59396.804363] lost page write due to I/O error on sdb1
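In case it helps anyone reproduce the excerpt above, this is roughly how such lines can be pulled out of the log. It is only a sketch: the function name disk_kernel_lines is made up, and the pattern is based on nothing more than the message formats visible in my log, so other kernels may word things differently.

```shell
# Print kernel log lines that mention a given ATA port or disk.
# $1: ATA port, e.g. ata4
# $2: disk name, e.g. sdb
# $3: log file (default /var/log/messages)
disk_kernel_lines() {
    port="$1"
    disk="$2"
    log="${3:-/var/log/messages}"
    # Match "ata4:" / "ata4.00:", "[sdb]", and "... on sdb1" messages,
    # but only on lines logged by the kernel.
    grep -E "kernel: .*(${port}(\.[0-9]+)?:|\[${disk}\]|on ${disk}[0-9]+)" "$log"
}
```

On my machine the call would be disk_kernel_lines ata4 sdb, possibly run with sudo if the log is not world-readable.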
My question is: what should I do now? How do I check my hard disk for errors under Linux?
Thank you for your advice.